Vol. 27, No. 5 October, 1943 


Journal of Applied Psychology 


Edited by: Donald G. Paterson, University of Minnesota


Consulting Editors 


Paul S. Achilles, Psychological Corporation; Walter V. Bingham, A.G.O., War Department;
Harold E. Burtt, Ohio State University; Arthur I. Gates, T. C. Columbia University;
John G. Jenkins, University of Maryland; Irving Lorge, T. C. Columbia University;
Quinn McNemar, Stanford University; Willard C. Olson, University of Michigan;
James P. Porter, Ohio University; Edward K. Strong, Jr., Stanford University;
Morris S. Viteles, University of Pennsylvania; Joseph Zubin, N. Y. Psychiatric Institute.





Table of Contents 


A Practical, Clinical Test for Organic Brain Damage: H. Hunt

Limitations in the Use of Intelligence Test Performance to Detect Mental
Disturbance: A. Magaret and C. Wright

A Test Battery for Identifying Potentially Successful Naval Electrical
Trainees: C. H. Lawshe, Jr. and G. R. Thornton

Studies in International Morse Code. I. A New Method of Teaching Code
Reception: F. S. Keller

Occupational Differences in Manipulative Performance of Applicants at a
Public Employment Office: L. Teegarden

Where They Like to Work: Work Place Preference of 288 Electrical Workers
in Terms of Music: W. A. Kerr

The Seashore Measures of Musical Talent and Speech Skill: H. Gilkinson

Prediction of College Scholarship for Groups Having Effort Indices of Re-
stricted Range: B. Sappenfield

A School of Nursing Selection Program: B. Criver

An Experimental Study of the Pl (“Plodding”) Characteristics of Persistence:
P. P. Roach

A Reply to Dr. Luckiesh: M. A. Tinker

News and Notes 

Book Review 

New Books, Monographs, and Pamphlets 





Published Bi-monthly by The American Psychological Association, Inc. 
With the Cooperation of The American Association for Applied Psychology 
Prince and Lemon Sts., Lancaster, Pa., and Northwestern University, Evanston, Illinois 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the Act of March 3, 1879.
Copyright, 1943, by The American Psychological Association, Inc.



















A Practical, Clinical Test for Organic Brain Damage * 
Howard F. Hunt 


University of Minnesota 


The detection of that type of intellectual deterioration or intellectual 
impairment symptomatic of damage to or disturbances of function of the 
human cerebrum such as occurs in general paresis, epileptic deterioration, 
alcoholic psychosis, Pick’s and Alzheimer’s diseases, senile dementia, and 
the like, is a routine clinical problem. This type of deterioration is often 
present in the face of good contact and rapport, high motivation, and 
effective cooperation on the part of the patient. It can be differentiated, 
both practically and theoretically, from the functional type of intellectual 
deterioration characteristic of chronic schizophrenic patients which is 
attributable in most cases to poor cooperation, low motivation, and poor 
contact on the part of the patient. 

The apparent rather than real nature of the functional type of de- 
terioration has been stressed both by Kendig and Richmond (5) and by 
Wesley (11). In a study of 500 cases of schizophrenia, the former authors
found, on the basis of school achievement of the subjects and their per- 
formance on the Babcock Mental Efficiency Test that, though the very 
low scores of some patients suggest at least temporary impairment of 
function, these low scores are frequently due to lack of cooperation and 
inattention. If the examiner can break through the barrier of psychotic 
preoccupation and establish rapport, the test results may be considered 
to be fairly representative of the patient’s prepsychotic level. Wesley, 
following a review of this and several other studies, concludes that “We
may then refer to the apparent impairment in intellect which occurs in 
schizophrenia, depression, and perhaps in many neuroses as functional 
deterioration, with the implication that the intellectual abilities of the 
patient are intact but are merely not being used effectively.” Following
this reasoning, the irreversible, organic deterioration effects treated in the
present paper are not expected to appear in schizophrenia except when
actual intercurrent brain damage has occurred.

* Based on a thesis submitted to the graduate faculty of the University of Minnesota
in partial fulfillment of the requirements for the Ph.D. degree. Acknowledgment is due
Professor S. R. Hathaway and Dr. J. C. McKinley for their suggestions and
encouragement in the prosecution of this study.

This report presents a practical method for the detection of brain 
damage by testing for the presence of concomitant intellectual deteriora- 
tion. The test development and validation are given without attempt at 
any theoretical discussion of the essential nature of deterioration. 

The psychologic, psychiatric, and social implications of the after- 
effects of brain damage are well known, and the need for early clinical 
detection of the condition has stimulated a number of investigators to 
develop examination techniques for the purpose. Goldstein and Scheerer
(2), Strauss and Werner (9), and Halstead (4) have devised or described a
number of valuable methods involving sorting or categorizing behavior. 
These techniques are useful clinical tools, especially for advanced cases 
and provide pertinent data on the psychopathology of organic brain 
damage. Methods yielding more readily quantifiable results, however, 
are frequently necessary in both the clinical and experimental situations. 

Babcock (1), Lidz, Gay, and Tietze (6), Simmins (8), Shipley (7), and 
Wesley (11) have developed quantitative examination methods which 
utilize the discrepancy between a person’s performance on tests which are 
sensitive and tests which are relatively insensitive to brain damage; the 
latter tests being thought of by some writers as serving to estimate the 
person’s predeterioration intellectual level. This general methodology is
particularly useful in that it clarifies the discrimination of relatively intelli- 
gent persons with deterioration from relatively dull persons who are other- 
wise normal since both would earn poor scores on the sensitive tests. 


The Test Battery 


The present examination method might best be described as a modifi- 
cation of the technique introduced by Babcock (1) and amplified by 
Wesley (11). It consists of three major divisions: the vocabulary test 
which is relatively insensitive to brain damage, the sensitive deterioration 
tests, and a group of interpolated tests. Table 1 lists the tests in the
order of presentation.¹

¹ The Tests and Manual are distributed by The University of Minnesota Press,
Minneapolis, Minnesota.


Table 1
Test Battery

Number of Test    Test
* Test 1          Exposure and first recall of 10 pairs of designs
* Test 2          Second recall of design pairs
* Test 3          Presentation and first recall of 10 pairs of words
* Test 4          Second recall of word pairs
  Test 5          1937 Stanford-Binet Vocabulary Test
------------------------------------------------------------------
† Test 6          Information
† Test 7          Saying the months of the year
† Test 8          Counting from 1 to 20
† Test 9          Counting from 3 to 30 by 3’s
† Test 10         Attention test
† Test 11         Counting backwards from 25 to 1
† Test 12         Reversed digits
† Test 13         Saying the months backwards
† Test 14         Serial subtraction of 3’s from 79
* Test 15         Third recall of word pairs
* Test 16         Third recall of design pairs

* These are the deterioration tests proper.
† These are the interpolated tests.
Note: The line below Test 5 indicates the termination of the short form of the test.


The tests are grouped so that the clinician may stop after Test 5 and
obtain a score on what is called the short form of the battery. This short
form requires about 15 minutes to administer as compared with the 30
minutes required by the long form. Scoring time is less than five minutes.

The six tests sensitive to deterioration (Tests 1, 2, 3, 4, 15, and 16)
involve learning and retention of new associations in both visual and
auditory spheres. Tests 1 and 2 are the first and second recall trials,
taken consecutively, for two groups of five pairs of remotely related
designs, and Test 16 represents the third recall trial for all ten pairs of
designs. These design pairs, on plates, are shown briefly to the subject,
and he is instructed to remember which designs were shown together in
each pair. Following the presentation of a group of five pairs, the subject
is shown one of the members of each design pair and asked to identify the
design which accompanied it from among a group of ten designs placed
before him on the table.

Tests 3 and 4 are the first and second recall trials for ten pairs of 
relatively unrelated one-syllable words, and Test 15 represents the third 
recall trial for these word pairs. The pairs are read to the subject with 
the instruction to remember which words he heard together; then, he is 
given the first word of each pair and asked to respond with the word that 
accompanied it. Bonus credit is given for rapidity of response on the 
assumption that the reaction time will be shorter for the more thoroughly 
learned responses. 

The use of two different learning “media” makes possible an estimate
of the deterioration status of persons with special sensory handicap,
though the reliability and validity of fragments of the battery are
undoubtedly inferior to those of the full battery.

The interpolated tests, which are speed and efficiency tests, have been 
included in the long form of the battery for several reasons. In the first 
place, they serve as validity indicators. Persons unable to perform these 
tests are probably too uncooperative or out of contact to be tested. 
Chronic, deteriorated schizophrenic patients as well as other uncoopera- 
tive or greatly disturbed persons may be discriminated from persons on 
whom the test can be used validly to detect organic brain damage. Ten- 
tative critical levels for this purpose have been set for each of the tests. 
In the second place, the interpolated tests provide a uniformly filled time 
interval between the second and third recall trials for the learning mate- 
rial, making possible the testing of retention as well as immediate recall. 
In the third place, a number of the interpolated tests are crude deteriora- 
tion tests similar to some of the standard psychiatric techniques used for 
the detection of this condition. Thus, they furnish a means of demon- 
strating the relative preservation of the subjects studied in this investiga- 
tion as well as giving meaning to the overlapping of scores between the 
criterion groups. Finally, these tests provide further data for research 
on problems of intellectual efficiency in various psychiatric conditions. 

The 1937 Stanford-Binet Vocabulary Test was selected for Test 5 
because of its widespread use and clinical convenience. It is adminis- 
tered, with minor variations, according to the authors’ directions. Place- 
ment of the vocabulary test immediately after the word-pairs test was 
designed to maximize retroactive inhibition. Though it is not necessarily 
assumed for this battery that vocabulary gives an accurate estimation of 
the subject’s predeterioration intellectual level, the subject’s vocabulary 
score and age combined provide a score which is relatively insensitive to 
abnormal brain damage. 


The Subjects 


The test battery was administered to two groups of subjects: the D or 
brain damaged group, and the C or control or “normal” group. Because 
of the fact that psychological tests are not necessary for the diagnosis of 
the more severe cases of deterioration, a special effort was made to include 
in the D group only those patients who were in good contact and coopera- 
tive and who were well enough preserved to resemble closely the type of 
patient whom the test must ultimately discriminate if it is to be clinically 
useful. Moreover, since many of the cases in which the question of brain 
damage comes up will also have psychiatric complications such as psy- 
chosis, psychoneurosis, economic and social marginality, or actual physi- 
cal disease, a special effort was made to include a majority of such persons 
in the C group in order to control for the effect of such factors. 













The D group consisted of 33 patients in various state institutions in 
Minnesota and in the United States Veterans’ Hospital at Fort Snelling. 
They all fulfilled the following requirements: 1. Medical diagnosis of 
organic cerebral damage. Whether or not the neuropsychiatric examina- 
tion clearly showed that the patient had deteriorated mentally was not 
crucial, though several of the subjects were also thus diagnosed. Patients 
with congenital conditions, birth injuries, and childhood brain trauma 
were excluded because the additional problem of arrested development 
in contrast to later loss of intellect might then be introduced. 2. Ability 
to speak and read the English language. 3. School attendance at least 
through the third grade. 4. Adequate muscular coordination and sen- 
sory acuity to enable the patient to perform the tests. 5. Adequate 
ability to cooperate and sustain the effort and attention necessary to perform
the tests. This included voluntary submission to the testing.
6. The subjects had to be between 16 and 70 years of age. These criteria 
eliminated most of the seriously deteriorated cases. 

The C group presented a rather difficult problem in selection because 
of the fact that persons with minimal, undiagnosed brain damage are 
likely to be included among persons selected to match the D group cases 
in respects other than cerebral damage. The 41 cases of the C group were 
selected largely from state institutions and psychologically quiet “back- 
waters of society” such as the more or less custodial ward of the United 
States Veterans’ Hospital at Fort Snelling and the neuropsychiatric ward 
of the University of Minnesota Hospitals. These subjects were either 
marginal economically or were for some reason or other relatively poor at 
managing their own affairs. 

The following criteria had to be fulfilled for inclusion within the C 
group: 1. Absence of diagnosis of organic brain damage. This does not 
mean a diagnosis of no brain damage since a great many of the patients 
were not examined neurologically. Because of this, the C group might 
better be called the “no diagnosis of brain damage” group rather than the
normal group. A rough attempt was made to rule out cases who showed 
obvious neurological abnormalities apparent to the examiner or a history 
of chronic alcoholism, head injury, convulsive disorders and syncope, 
rheumatic fever, diabetes mellitus, apoplexy, or known hypertension since 
these conditions are often associated with brain damage. It is probable, 
however, that truthful answers to questions concerning these conditions 
may not always have been obtained, at least among some of the hospital 
employees, many of whom believed, in spite of reassurance, that they were 
being given a job qualification examination. 2. The subjects also had to 
fulfill the requirements outlined under items 2, 3, 4, 5, and 6 in the section 
on the D group. 













Table 2 gives the diagnostic classification of both the C and D group 
cases. 








Table 2
Diagnostic Classification of D and C Group Cases

Diagnosis                                        No. of Cases

Brain Damaged Cases (D Group)
  [illegible]                                         13
  [illegible]                                          5
  Epileptic Deterioration (Diagnosed)                  4
  [illegible]                                          1
  Syphilitic Meningo-encephalitis                      1
  Hypoglycemic Encephalopathy                          2
  Hypertensive Encephalopathy                          1
  Skull Fracture with Sydenham’s Chorea                1
  [illegible]                                          1
  Meningioma of Left Frontal Lobe                      1
  [illegible]                                          1
  Epilepsy with Intracranial Neoplasm                  1
                                                      33

Control Cases (C Group)
  [illegible]                                         15
  Psychosis
    [illegible]                                        4
    Manic-Depressive (Depressed)                       1
  [illegible]                                          1
  [illegible]                                          3
  [illegible]                                          1
  [illegible]                                          1
  [illegible]                                          1
  Brucellosis (also Psychoneurosis)                    1
  Pathology of Intervertebral Disc                     2
  [illegible]                                         11
                                                      41





In order that the discriminating power of the tests might be evaluated, 
selected C and D samples of 25 cases each, equated on the basis of vocabu- 
lary and age, were drawn independently of their scores on the other tests, 
one sample from each group. Further statistical manipulation was con- 
fined to the selected samples, the residual 24 cases being kept as test cases 
to be scored later on the basis of norms derived entirely from the selected 
C sample in order to establish further the validity of the test battery. 

Age, vocabulary, and occupational status differences between the
selected samples were insignificant. Both of the selected samples were
also similar to the general employed adult male population with respect to
occupational status (3 and 10). The Chi squared test was used to check
for statistical significance of the differences.

As a result of this matching, these two selected C and D samples may 
be assumed to differ in the deterioration test performance chiefly on the 
basis of the brain damage factor. 


Results 


Comparison of the performance of the two selected samples on the 
various tests of the battery revealed striking differences on the deteriora- 
tion tests and much smaller differences on the interpolated tests. Using 
raw scores, all of the deterioration tests showed critical ratios between the 
two groups greater than 6.8 and a total overlap of 50 per cent or less (total 
overlap is defined as the percentage of the total number of cases falling 
between the highest scoring D case and the lowest scoring C case, in- 
clusive). Though the differences between the groups on the interpolated 
tests were significant or of borderline significance for the most part, none 
of the critical ratios was over 3.8 and the total overlap was 90 per cent or 
more for all of these tests except Test 14, which showed an overlap of 68
per cent. This great overlap on the interpolated tests emphasizes the 
superior discriminative power of the deterioration tests as contrasted with 
the interpolated tests. Here is evidence that the usual psychiatric exam- 
inations which rely on tests similar to the interpolated tests are not suffi- 
ciently sensitive to discriminate between slightly deteriorated cases and 
normals. A weighting procedure was developed and applied, but it 
proved to add little to the discrimination. 
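
The two indices used in this comparison can be sketched in a few lines of Python. The paper does not print its formula for the critical ratio, so the version below is an assumption: the difference between the group means divided by the standard error of that difference for independent samples. The function names and the small score lists are hypothetical; only the means and standard deviations for the total deterioration score are taken from Table 3.

```python
import math


def critical_ratio(mean_c, sd_c, mean_d, sd_d, n_c=25, n_d=25):
    """Difference between two means divided by its standard error.

    Assumed formula (not stated in the paper): independent samples,
    SE = sqrt(sd_c^2 / n_c + sd_d^2 / n_d).
    """
    se_diff = math.sqrt(sd_c ** 2 / n_c + sd_d ** 2 / n_d)
    return (mean_c - mean_d) / se_diff


def total_overlap(c_scores, d_scores):
    """Per cent of all cases lying between the lowest C score and the
    highest D score, inclusive (the definition given in the text)."""
    low, high = min(c_scores), max(d_scores)
    if low > high:
        return 0.0  # the two distributions do not overlap at all
    everyone = list(c_scores) + list(d_scores)
    inside = sum(1 for score in everyone if low <= score <= high)
    return 100.0 * inside / len(everyone)


# Total score on the deterioration tests, values as printed in Table 3.
print(round(critical_ratio(92.3, 28.1, 19.9, 13.9), 1))  # 11.5

# Made-up raw scores, for illustration only.
print(total_overlap([10, 12, 15, 18], [2, 3, 4, 11]))  # 25.0
```

Under this assumed formula the Table 3 figures for the total score reproduce the printed critical ratio of 11.5, which suggests, but does not prove, that a formula of this kind was used.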

Table 3 presents a detailed statistical comparison of the performance 
of the selected C and D samples on all of the tests. 

Though high reliability is probably not crucial in the case of an exam- 
ination method of demonstrated validity, correlations for the selected C 
group between Tests 1 and 2 and between Tests 3 and 4 reveal substantial 
consistency of performance on consecutive trials for both types of learning 
material. The correlation for design-pairs was .72 and that for word- 
pairs was .80, and the average of these two using Z functions was .76. 
These correlations probably underestimate the true consistency of per- 
formance because alternate trials were used rather than alternate items. 
For the selected C sample, the average intercorrelation for the deteriora- 
tion tests was .40. 
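
The averaging "using Z functions" mentioned above is presumably Fisher's z transformation: each correlation is converted to z, the z values are averaged, and the mean is converted back to r. A minimal sketch under that assumption (the function name is hypothetical):

```python
import math


def average_correlations(rs):
    """Average correlations through Fisher's z (z = atanh(r))."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Design-pair and word-pair consistency coefficients from the text.
print(round(average_correlations([0.72, 0.80]), 2))  # 0.76
```

The result agrees with the reported average of .76.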

Table 3
Comparison of Selected Control Group (N = 25) and Brain Damaged
Group (N = 25) on All Tests

                                   Control Cases     Brain Damage Cases   Critical   Per Cent
Tests                              Mean     S.D.     Mean     S.D.        Ratio      Overlap
 1. Design Pairs, First Recall     11.7      4.1      2.6      2.0          9.9         20
 2. Design Pairs, Second Recall    15.4      5.6      3.1      2.7          9.9         24
 3. Word Pairs, First Recall       14.1      7.6      3.2      2.8          6.8         50
 4. Word Pairs, Second Recall      17.8      6.7      3.8      3.5          9.2         42
 5. Vocabulary                     21.7      5.9     21.3      5.5          0.3        100
 6. Information                    13.3  [illegible] 13.1      3.3          0.3        100
 7. Months Forward                 26.9      1.0     24.8      5.3          1.9         92
 8. Counting, 1-20                 26.2      1.3     24.9      2.0          2.8         94
 9. Counting, 3-30                 54.2      3.4     46.6     12.4          3.0         90
10. Attention, not scored
11. Counting, 25-1                 36.7      4.8     31.5      9.6          2.4         92
12. Digits Backward                 4.0      0.9      3.1      0.8          3.8         98
13. Months Backward                54.1     16.2     44.9     20.5          1.8        100
14. Counting, 79-1                 24.4     16.1     12.0      9.7          3.3         68
15. Word Pairs, Third Recall       18.7      6.0      4.4      4.8          9.4         42
16. Design Pairs, Third Recall     14.7      5.8      2.7      2.3          9.6         24

Total Score on
Deterioration Tests                92.3     28.1     19.9     13.9         11.5         12


The correlation between total score on the deterioration tests and
vocabulary was .51, and the correlation between these scores and age was
-.37, while the correlation between age and vocabulary scores was .07.
These correlations are consistent with the findings of previous
investigators on the relationships between these several variables. Probably,
when a larger sample of cases is studied, the relationship between age and
total score will be found to be curvilinear, but the obtained correlations
are the best estimates of the relationships which could be obtained from
these data. The multiple correlation of total deterioration test score
with age and vocabulary was .65.
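
The multiple correlation can be checked from the three pairwise correlations with the standard two-predictor formula. The sketch below is an illustration only: the function names are hypothetical, and the standardized (beta) weights it prints are computed from the reported correlations, not quoted from the paper.

```python
import math


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of y with two predictors from pairwise r's."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


def beta_weights(r_y1, r_y2, r_12):
    """Standardized regression weights for the same two predictors."""
    denom = 1 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom


# y = total deterioration score, 1 = vocabulary, 2 = age.
r_vocab, r_age, r_vocab_age = 0.51, -0.37, 0.07

print(round(multiple_r(r_vocab, r_age, r_vocab_age), 2))  # 0.65
b_vocab, b_age = beta_weights(r_vocab, r_age, r_vocab_age)
print(round(b_vocab, 2), round(b_age, 2))  # 0.54 -0.41
```

The computed multiple correlation matches the reported .65, and the signs of the weights match the direction described in the text: positive for vocabulary, negative for age.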

By use of the regression weights determined from the selected C sample
statistics, it is possible to predict the score expected on the deterioration
tests if no deterioration has occurred. This predicted deterioration
test score is based upon vocabulary performance and age, with higher
vocabulary predicting a higher score and greater age predicting
a lower score.

In order to evaluate the discriminative power of the test battery, each
subject’s predicted score on the deterioration tests was determined by the
regression weights, and the difference between this and his obtained score
was found. These difference scores represent the effect of deterioration
since they are the difference between the person’s actual score on a test
sensitive to deterioration and the score that should be obtained assuming
the person is normal and obtains a score commensurate with his vocabulary
level and age. The raw differences were converted into standard
scores by dividing them by the standard error of estimate. The resultant
standard scores were stated in terms of modified “T” scores, that is, the
predicted value was set at 50 and the standard error of estimate at 10.
The arrangement was such that persons whose actual scores fell below
their predicted scores received T scores above 50 and vice versa. Thus, a
person with an actual score one standard error of estimate below his
predicted score would obtain a T score of 60.
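
The scoring procedure described above can be sketched as follows. The paper does not print the regression weights or the standard error of estimate, so every number in `EXAMPLE_NORMS` is invented for illustration, and the names `Norms`, `predicted_score`, `t_score` and `standard_error_of_estimate` are hypothetical. The last function uses the usual textbook relation between the standard error of estimate, the criterion standard deviation and the multiple correlation, which is assumed here.

```python
import math
from dataclasses import dataclass


@dataclass
class Norms:
    """Regression constants; in practice derived from the selected C sample."""
    intercept: float
    w_vocab: float  # raw-score weight for vocabulary (positive)
    w_age: float    # raw-score weight for age (negative)
    see: float      # standard error of estimate


def predicted_score(norms, vocab, age):
    """Deterioration-test score expected if no deterioration has occurred."""
    return norms.intercept + norms.w_vocab * vocab + norms.w_age * age


def t_score(norms, vocab, age, obtained):
    """Modified T score: 50 at the predicted value, 10 per standard error
    of estimate, rising as the obtained score falls BELOW prediction."""
    deficit = predicted_score(norms, vocab, age) - obtained
    return 50.0 + 10.0 * deficit / norms.see


def standard_error_of_estimate(sd_y, multiple_r):
    """Assumed textbook relation: SEE = SD_y * sqrt(1 - R^2)."""
    return sd_y * math.sqrt(1.0 - multiple_r ** 2)


# Hypothetical constants, NOT the published norms.
EXAMPLE_NORMS = Norms(intercept=60.0, w_vocab=2.5, w_age=-0.5, see=20.0)

print(predicted_score(EXAMPLE_NORMS, vocab=22, age=40))       # 95.0
print(t_score(EXAMPLE_NORMS, vocab=22, age=40, obtained=75))  # 60.0
print(t_score(EXAMPLE_NORMS, vocab=22, age=40, obtained=95))  # 50.0
```

With these invented constants, a subject scoring one standard error of estimate below prediction receives a T score of 60, as in the text. Applying `standard_error_of_estimate` to the Table 3 standard deviation of 28.1 and the multiple correlation of .65 gives roughly 21.4 raw-score points, which should be read as an illustration and not as the published value.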

The probability of the occurrence of T scores in excess of any given
magnitude in the examination of “normal” persons may be inferred from
the integrals of the normal curve. Thus, if a subject’s T score is high, as
are the T scores of persons who are deteriorated, the magnitude of that
T score is a rough guide to the probability of that person’s belonging
to the “normal” group, T scores over 70 being associated with relatively
low probabilities.
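
The normal-curve integral referred to here can be evaluated directly. The sketch below assumes, as the text does, that T scores of normal persons are distributed normally with mean 50 and standard deviation 10; the function name is hypothetical.

```python
import math


def prob_normal_at_or_above(t, mean=50.0, sd=10.0):
    """Upper-tail area of the normal curve at T score `t`."""
    z = (t - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


for t in (60, 68, 70):
    print(t, round(prob_normal_at_or_above(t), 3))
```

Under this assumption only about 2 or 3 normal persons in 100 would be expected to reach a T score of 70, and fewer than 4 in 100 a T score of 68.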

As will be indicated below, the details of this statistical transformation 
are not vital to the effectiveness of the test since that is established by 
empirical methods. The T score is used as a convenient tool to obtain a 
score distribution having an approximate relation to standard probabili- 
ties of the normal curve, and its shortcomings are irrelevant to the main 
thesis. 

Figure 1 shows the distribution of T scores for both of the selected 
samples and the residual cases of the whole C and D groups. Considering 
the relatively marginal character of most of the C and D group cases as 
well as the extensive overlapping between the two groups on the inter- 
polated tests, the overlapping of the T scores between the C and D groups 
is negligible. If a tentative critical score be set at 68, none of the selected 
C group, only two cases of the residual C group, one case of the selected D 
group, and one case of the residual D group would be misclassified. The 
bracketed C case represents a doubtful “normal” case which the writer
suspects to be a case of undiagnosed brain damage. The suspicion is
based on clinical grounds other than test performance. This patient was 
studied early in the investigation before the surprisingly high incidence 
of undiagnosed brain damage among the “normal” population became 
apparent, and at that time a diagnosis of psychoneurosis was taken at face 
value. If his score be ignored, only one of the C group cases would be 
misclassified and the total overlap between the C and D groups would be 
reduced from 13.5 per cent to 10.8 per cent. 





In connection with the demonstration of the validity of the battery,
it should be emphasized that the regression weights, multiple correlation,
and standard error of estimate were derived from the performance of the
selected C sample only. All T scores, therefore, represent ordinates of a
theoretical distribution curve based on the performance of the selected C
sample cases. All of the D group cases and the residual C group cases are
test cases to establish validity since they make no normative contribution,
and the observed differentiation between the C and D group scores has not
been enhanced by statistical manipulation.

[Figure 1: chart of T-score class intervals from 26-30 to 102-106, with
Brain Damage Cases (selected, residual) plotted on one side and Control
Cases (selected, residual, doubtful) on the other.]

Fig. 1. Distribution of T scores for long form of the test.
Note: The class intervals have been adjusted to give maximum detail at the point
of overlap between the two distributions.

No statistically significant differences in T scores were found between
the various diagnostic subgroups within the C and D groups or between
males and females in the C group. Significance of these differences was
determined by the use of Student’s “t” test.

By reference to Table 3, it can be seen that total score on the deteriora- 
tion tests discriminated between the C and D groups almost as well as 
did the T scores based on the statistical procedures outlined above. The 
actual cases which fell into the borderline category with each method of 
evaluation differed, however. The agreement between the patient’s clini- 
cal status and the magnitude of his T score was close, whereas such was not 
the case with respect to raw total score, thus justifying the use of the T 
score technique even though a critical value could be set for the raw total 
score on the deterioration tests which would adequately separate the C 
and D groups. 

The short form of the battery discriminated between the C and D 
groups almost as well as did the long form. The correlation between the 
scores on the long form and on the short form was .99, using all of the 
standardization cases. 


Discussion 


Though the number of cases studied in this investigation has been rela- 
tively small, the validity of the test battery as a discriminating instrument 
is statistically established. This validity determination has been con- 
firmed by subsequent clinical use of the test with the tentative norms 
already obtained. Clinical use has also shown the examination to be 
simple to administer and practical for routine detection of early and 
otherwise difficult to detect organic brain damage. 

As is well known, a certain percentage of the cases in which actual 
brain damage has occurred will show no neurological symptoms and would 
go undetected if only the standard neuropsychiatric examination methods 
were used for diagnosis. Thus, in the clinical examination of diagnostic 
problems, the clinician should expect that a certain number of cases will 
show T scores above 68 or 70 without concomitant neurological symp- 
tomatology. This finding in itself is sufficient to justify the strong sus- 
picion of cerebral pathology, a suspicion which will often be strengthened 
or confirmed by the appearance of new evidence for possible brain damage 
from a more extensive history or physical examination. A certain num- 
ber of cases of this sort will appear in a supposedly normal group selected 
as was the C group in this study, and some of the overlapping between the 
C and D groups may be accounted for in terms of this factor. 

In the clinical examination situation, moreover, a certain number of
cases with obvious neurological symptomatology but with relatively low
T scores will be encountered. This is also to be expected since, within
rough limits, the placement of the lesion rather than its magnitude is
crucial for the production of neurological signs. Disorders such as
encephalitis or multiple sclerosis or small lesions in the motor cortex or
midbrain may produce very marked neurological symptoms accompanied by
little loss of intellectual power, and when this occurs low T scores are
usually obtained.

In general, the test battery has been developed to meet a routine need, 
and it has proved relatively satisfactory for use in the neuropsychiatric 
clinic. Most of its usefulness lies in the additional evidence it offers to 
support clinical suspicion of organic damage from various causes when 
other available evidence is inconclusive. It also aids in prognostic state- 
ments since high scores indicate irreversible damage, though it must be 
remembered that such patients are often able to continue in society when 
routine demands on their ability to learn are minimal. 

The battery is appropriate for administration to the majority of 
patients of either sex with an estimated mental age of 8 or over, who are 
between the ages of 16 and 70, and who have had a relatively normal 
exposure to American culture and the English language. 


Received July 24, 1943. 


References 


1. Babcock, H. An experiment in the measurement of intellectual deterioration. 
Arch. Psychol., N. Y., 1930, No. 117, 1-105.

2. Goldstein, K., and Scheerer, M. Abstract and concrete behavior; an experimental
study with special tests. Psychol. Monogr., 1941, 53, No. 2 (whole No. 239). 
Pp. 151. 

3. Goodenough, F. L. and Anderson, J. E. Experimental child study. New York: 
The Century Co., 1931. Pp. 501-512. 

4. Halstead, W. C. A preliminary analysis of grouping behavior in patients with 
cerebral injury by the method of equivalent and non-equivalent stimuli. Amer. 
J. Psychiat., 1940, 96, 1263-1294. 

5. Kendig, I., and Richmond, W. V. Psychological studies in Dementia Praecox. Ann
Arbor: Edwards Bros., Inc., 1940. Pp. 1-166.

6. Lidz, T., Gay, J. R., and Tietze, C. Intelligence in cerebral deficit states and
schizophrenia measured by the Kohs block test. Arch. Neurol. Psychiat., 1942,
48, 558-562.

7. Shipley, W. C. A self-administering scale for measuring intellectual impairment 
and deterioration. J. Psychol., 1940, 9, 371-377. 

8. Simmins, C. Deterioration of “G” in psychotic patients. J. Ment. Sci., 1933, 79, 
704-734. 

9. Strauss, A. A., and Werner, H. Disorders of conceptual thinking in the brain- 
injured child. J. Nerv. Ment. Dis., 1942, 96, 153-172. 

10. Terman, L. M., and Merrill, M.A. Measuring intelligence. New York: Houghton, 
Mifflin Co., 1937. Pp. 14 and 302 ff. 

11. Wesley, S. Medford. A study of the use of recent memory tests in the measurement
of intellectual deterioration. Unpublished Ph.D. Thesis, University of
Minnesota, 1941.











Limitations in the Use of Intelligence Test Performance 
to Detect Mental Disturbance 


Ann Magaret 
Stanford University 
and 
Clare Wright 


Sonoma State Home 


One of the many temptations which beset the busy clinical psycholo- 
gist is the use of differential responses of patients to items of standardized 
intelligence scales as evidence for or against a diagnosis of mental dis- 
turbance. The informal conversation of clinicians, and, to a lesser extent, 
the practical literature of test administration and interpretation, make 
frequent mention of “cues” pointing to the presence of disturbing factors 
in test performance. The difference between verbal and performance 
items, a relatively high score on vocabulary tests, special difficulty with 
items depending upon the analysis of spatial patterns, an unusual scatter 
of successes and failures—these cues are widely, if unsystematically, used 
when mental confusion is suspected. 

Ideally, of course, the use of such cues from whatever measuring 
devices may be currently employed should wait upon studies of the 
characteristic responses of carefully defined clinical groups. Actually, 
the insistence of the diagnostic and therapeutic problems confronting 
the clinical psychologist makes this procedure not only undesirable but 
often impossible. Moreover, in the intensive study of the single indi- 
vidual which is the primary concern of the clinician, diagnosis of any 
patient depends upon many factors besides those which prove important 
in the differentiation of large groups of patients from one another. 
Nevertheless, results from group studies will occasionally prove of value 
to the clinician, either in suggesting areas where diagnostic cues might 
be sought, or in warning against the ready acceptance of cues which 
prove of questionable value even with large groups. 

In an earlier study Magaret (11), using the responses of a group of 
adult hospitalized schizophrenics to items on the Wechsler-Bellevue 
Adult Intelligence Scale, found reliable differences between schizophrenic 
and control performance on some of the tests. Age was held constant 
by limiting the group to patients between thirty and forty years of age. 
Hence, such results might be explained in at least two ways: (1) Since
the level of performance of the schizophrenic patients was in general 
below average, the differences might be characteristic merely of indi- 
viduals of limited intellectual ability; or (2) these differences may be 
characteristic of the schizophrenic syndrome. Only if the latter hy- 
pothesis can be verified is the use of the differences as cues in diagnosis 
justified. 

The possible effect of low intelligence, uncomplicated by age or by 
the schizophrenic process, upon differential test performance has not 
been independently studied with the Wechsler-Bellevue scale. Wechsler,
Israel, and Balinsky (22) have attacked the problem by comparing test
responses of borderline defective and mentally deficient individuals to
the items of this scale, but since the subjects range in age from 20 to 49 
years, the results are complicated to an unknown degree by the presence 
of changes due to age alone. That test performance during this thirty- 
year span will almost certainly show some characteristic changes resulting 
from age may be inferred from studies with normal adults (14), and from 
studies of adult morons by Johnson and Fernow (7) and by Wright (24). 

The purpose of the present investigation is therefore to examine the 
test responses of a group of non-psychotic individuals of below-average 
intelligence, having the same age range as the schizophrenics in the 
original study. The present research offers evidence that limited intel- 
lectual capacity is not alone the explanation for differences obtained 
earlier between schizophrenics and controls. The results further suggest 
certain areas of performance which reliably differentiate schizophrenic 
from mentally deficient subjects, thus supporting the second hypothesis 
mentioned above. At the same time, however, the results cast some 
doubt on the wisdom of using certain other cues often mentioned as 
differentiative between these two syndromes. 

Both results may have special interest at the present time, when the 
Wechsler-Bellevue scale, in revised form, is being used for purposes of 
selection and classification in various branches of the armed forces, and 
when any valid cues which might aid in the screening of neuropsychiatric 
suspects are of the utmost importance. 


Plan of the Investigation 

Subjects of the present study are forty mental defectives residing in 
Sonoma State Home, who are compared with the eighty hospitalized 
schizophrenics reported earlier, and with 210 control individuals who are 
members of the standardization group of the Wechsler-Bellevue scale. 
Range of chronological age for all three groups is from thirty to thirty- 
nine; mean age of the mental defectives is 35.5 years, and of the schizo- 
phrenics, 35.7. All subjects included in the present report were born in 
English-speaking countries and give no evidence of language handicap. 






















The selection of the group of schizophrenics has already been de- 
scribed (11). It should be mentioned here, however, that no member 
of the group had ever been diagnosed as mentally defective or as psychotic 
with mental deficiency. 

In order to secure a group of mental defectives roughly comparable 
in intelligence to the schizophrenics, only morons were included in the 
investigation. The range of 1916 Stanford-Binet IQ’s for these forty 
subjects is from 50 to 71, with a mean IQ of 57.7. An individual of 
moron grade was excluded from the present group if he gave evidence of 
secondary causation, of a psychotic or neurotic condition, or of a sensory 
or physical defect such as to interfere with his test performance. It is 
of interest to note that every individual in the institution who fulfilled 
these criteria of age, intelligence, and diagnosis is included in this report. 

The eleven tests comprising the Wechsler-Bellevue Adult Intelligence 
Scale (21) were administered by the writers to each of the forty mentally 
deficient subjects. This scale is composed of six verbal tests (informa- 
tion, comprehension, arithmetic reasoning, digit repetition, similarities, 
and vocabulary); and five performance tests (picture completion, picture 
arrangement, object assembly, block designs, and digit-symbol substitu- 
tion). Scores for each sub-test were converted to standard scores from 
the tables provided by Wechsler. 

Results considered in terms of standard scores alone, however, fail to 
answer the question raised by the clinical psychologist concerning cues 
which differentiate between mental deficiency and schizophrenia. What 
the clinician, administering and scoring a standardized intelligence test, 
observes for any single patient is a list of eleven scores. The variations 
of these scores for this patient (which of the scores are relatively high, 
which relatively low) constitute the data from which he must draw his 
conclusions regarding the presence or absence of mental disturbance. 
A more useful measure, therefore, would take into account for each 
subject the deviations of sub-test scores from the mean score obtained 
by that subject. The question here is one of intra-individual rather 
than inter-individual differences in performance. 

In order to answer the practical question more adequately, therefore, 
standard scores were converted to deviation scores. For each subject, 
the mean standard score on the eleven tests was computed; this mean 
was then subtracted from the subject’s standard score on each of the 
eleven tests. Hence, a plus value for such a deviation score signifies 
that the subject’s performance in the particular test is above his own 
mean performance on the scale; a minus value for such a deviation score 
indicates that the subject’s performance in the particular test is below 
his own mean performance. 
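
The conversion just described can be sketched in a few lines of Python. The scores below are hypothetical, and the names `SUBTESTS` and `deviation_scores` are illustrative only; the sketch does no more than subtract a subject's own mean standard score from each of his eleven standard scores.

```python
from statistics import mean

# The eleven Wechsler-Bellevue sub-tests named in the text.
SUBTESTS = [
    "information", "comprehension", "arithmetic", "digit repetition",
    "similarities", "vocabulary", "picture completion",
    "picture arrangement", "object assembly", "block designs", "substitution",
]

def deviation_scores(standard_scores):
    """Return each sub-test score minus the subject's own mean score."""
    if set(standard_scores) != set(SUBTESTS):
        raise ValueError("expected one standard score for each of the eleven sub-tests")
    own_mean = mean(standard_scores.values())
    return {test: score - own_mean for test, score in standard_scores.items()}

# Hypothetical subject (not taken from the study); the scores average 4.
subject = dict(zip(SUBTESTS, [5, 4, 1, 4, 3, 7, 3, 4, 5, 5, 3]))
devs = deviation_scores(subject)

for test in SUBTESTS:
    print(f"{test:20s} {devs[test]:+.1f}")
```

For this invented subject the arithmetic score falls three points below his own mean and the vocabulary score three points above it; by construction the eleven deviation scores sum to zero.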










Results and Discussion 


Means and standard deviations of distributions of standard scores for
the morons, the schizophrenics, and the control group are shown in
Table 1. It is apparent from this table that the general level of
performance on the eleven tests is highest for the controls, next for the
schizophrenics, and lowest for the morons, as would be expected.
Differences between the scores on individual tests within the moron,
schizophrenic, and control groups are also apparent, however, and it is
with these differences that this study is particularly concerned.


Table 1

Means and Standard Deviations of Distributions of Standard Scores

                              Morons              Schizophrenics             Controls
Wechsler-Bellevue
Sub-Tests                  N      M      σ        N      M      σ        N      M      σ

Information               40    3.8    1.8 [?]   80    8.1    3.4      210    9.8    3.2
Comprehension             40    3.8    2.0       80    6.0    3.5      210    9.7    3.3
Arithmetic                40    0.7    1.3       80    5.4    3.9      210    9.2    3.4
Digit repetition          40    3.7    1.9       80    6.6    2.6      210    8.9    3.2
Similarities              40    2.8    1.8       80    6.8    4.0      210    9.5    2.9
Vocabulary                     [illegible]       74*   8.5    2.7      210    9.8    3.2
Picture completion        40    3.0    2.5       80    5.8    3.7      210    9.3    3.4
Picture arrangement       40    3.1    2.0       80    5.4    3.2      210    9.0    3.3
Object assembly           40    4.0    1.4       80    7.4    3.6      210    9.2    3.2
Block designs             40    3.7    2.4       80    7.2    3.5      210    9.4    3.4
Substitution              40    2.4    1.8       80    5.2    2.9      210    9.2    3.3

* In the case of six patients (obtained from the Stanford Hospital) vocabulary tests
had not been administered.


Table 2

Means and Standard Deviations of Distributions of Deviation Scores *

                             Morons         Schizophrenics        Controls
Wechsler-Bellevue
Sub-Tests                 Mean      σ        Mean      σ        Mean      σ

Information               +0.6    1.8        +1.5    1.9        +0.5    2.4
Comprehension             +0.6    2.0        -0.6    2.0        +0.4    2.5
Arithmetic                -2.5    1.3        -1.1    2.3        -0.2    2.6
Digit repetition          +0.5    2.1        +0.1 [?]2.3        -0.4    2.8
Similarities              -0.4    1.7        +0.2    2.4        +0.1    2.0
Vocabulary                +2.0    1.5        +2.0    1.6        +0.5    2.4
Picture completion        -0.2    2.1        -0.8    2.2        +0.0    2.7
Picture arrangement       -0.1    1.6        -1.2    2.1        -0.3    2.7
Object assembly           +0.7    3.1        +0.9    2.6        -0.1    2.9
Block designs             +0.5    1.8        +0.6    2.1        +0.1    2.4
Substitution              -0.8    1.3        -1.3    1.9        -0.1    2.4

* A plus value signifies that the mean value of the single test is above the mean of
all eleven tests; a minus value signifies that the mean value of the single test is below
the mean of all eleven tests.

In order to illustrate more clearly the extent to which various tests
reflect differences among the groups of subjects, means and standard
deviations of the distributions of deviation scores are presented in Table 2.
A casual inspection of this table reveals two important points: (1) As
compared with the controls, both the schizophrenics and the morons
have higher deviation scores. Since these scores are actually deviations
from a mean, they constitute one measure of variability of performance,
that is, of consistency of score from test to test. When the arithmetic
sum of these scores is calculated for each group to obtain an over-all
estimate of variability of performance, the control group yields a value
of 2.7, while the morons yield a value of 8.9, and the schizophrenics one
of 10.3. (2) There are marked differences among the groups in magnitude
of deviation score for the various sub-tests.


Table 3

Critical Ratios of the Differences between Groups in Mean Deviation Scores *

                          Morons vs. Schizophrenics      Morons vs. Controls
Wechsler-Bellevue
Sub-Tests                  D_M    σ_DM  D/σ_DM           D_M    σ_DM  D/σ_DM

Information               -0.9    0.36    2.51          +0.1    0.33    0.30
Comprehension             +1.2    0.39    3.07          +0.2    0.36    0.56
Arithmetic                -1.4    0.34    4.17          -2.3    0.28    8.30
Digit repetition          +0.5    0.42    1.20          +0.9    0.38    2.39
Similarities              -0.6    0.38    1.57          -0.5    0.30    1.65
Vocabulary                 0.0    0.31    0.00          +1.5    0.29    5.16
Picture completion        +0.6    0.42    1.42          -0.2    0.39    0.51
Picture arrangement       +1.1    0.35    3.18          +0.2    0.32    0.62
Object assembly           -0.2    0.57    0.35          +0.8    0.54    1.50
Block designs             -0.1    0.37    0.27          +0.4    0.33    1.20
Substitution              +0.5    0.30    1.68          -0.7    0.27    2.63

* A plus value signifies that the mean value of the moron group is relatively higher
than that of the schizophrenic or control group; a minus value signifies that the mean
value of the moron group is relatively lower than that of the schizophrenic or control
group.
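
The over-all variability figures quoted for Table 2 can be checked with a short Python sketch. It assumes that the "arithmetic sum" means the sum of the mean deviation scores taken without regard to sign, an interpretation which reproduces the reported values for the morons and the controls. Only those two columns of Table 2 are transcribed here, and the function name is illustrative.

```python
# Mean deviation scores transcribed from Table 2, in the table's sub-test order:
# information, comprehension, arithmetic, digit repetition, similarities,
# vocabulary, picture completion, picture arrangement, object assembly,
# block designs, substitution.
MORONS = [+0.6, +0.6, -2.5, +0.5, -0.4, +2.0, -0.2, -0.1, +0.7, +0.5, -0.8]
CONTROLS = [+0.5, +0.4, -0.2, -0.4, +0.1, +0.5, 0.0, -0.3, -0.1, +0.1, -0.1]

def overall_variability(mean_deviations):
    """Sum the mean deviation scores without regard to sign (assumed reading)."""
    return round(sum(abs(d) for d in mean_deviations), 1)

morons_total = overall_variability(MORONS)
controls_total = overall_variability(CONTROLS)

print("morons:  ", morons_total)
print("controls:", controls_total)
```

Summing with regard to sign would not serve the purpose, since deviations from a subject's own mean cancel one another; disregarding sign gives 8.9 for the morons and 2.7 for the controls, in agreement with the text.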

In the study reported earlier, the schizophrenics were superior¹ to
the controls in deviation scores on tests of information, vocabulary, and
object assembly; and inferior in deviation scores on comprehension,
arithmetic, picture completion, picture arrangement, and digit-symbol
substitution. Table 3 presents critical ratios of the differences in
deviation scores between morons and controls and between morons and
schizophrenics. Two tests reliably differentiate morons from controls:
arithmetic reasoning, on which the morons are inferior to the controls in
deviation score; and vocabulary, in which the morons are superior to the
controls in deviation score.

¹ In the previous investigation, as in the present one, a critical ratio of 3.0 is accepted
as a measure of significance.

The tendency for the feeble-minded to be relatively high in vocabu- 
lary and relatively low in arithmetic reasoning has been found also by 
Bradway (3). Administering the Stanford Achievement Test to a group 
of 53 moron and borderline subjects, she found the mean score for arith- 
metic reasoning to be 1.9 years below that for word meaning. The 
critical ratio of this difference between the means is 4.9, satisfying the 
criterion for significance. 

Three tests reliably differentiate morons from schizophrenics: com- 
prehension and picture arrangement, in which the schizophrenics are 
inferior to the morons; and arithmetic reasoning, in which the schizo- 
phrenics are superior to the morons. Consideration of the interrelation- 
ships of these differences may indicate more clearly the justifiability of 
using certain cues for the detection of mental disturbance. 
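
The selection of "reliable" differences from Table 3 can be illustrated with a brief Python sketch. It divides each difference in mean deviation score by its standard error, as read from Table 3, and keeps the sub-tests whose critical ratio reaches the criterion of 3.0 adopted in the study. The variable and function names are illustrative. Because the tabled differences are rounded, the recomputed ratios differ slightly from the printed ones (for example, about 8.2 rather than 8.30 for arithmetic against the controls).

```python
CRITERION = 3.0  # critical ratio accepted as significant in the study

# Each row: sub-test, then D and its standard error for morons vs. schizophrenics,
# then D and its standard error for morons vs. controls (transcribed from Table 3).
TABLE_3 = [
    ("information",         -0.9, 0.36, +0.1, 0.33),
    ("comprehension",       +1.2, 0.39, +0.2, 0.36),
    ("arithmetic",          -1.4, 0.34, -2.3, 0.28),
    ("digit repetition",    +0.5, 0.42, +0.9, 0.38),
    ("similarities",        -0.6, 0.38, -0.5, 0.30),
    ("vocabulary",           0.0, 0.31, +1.5, 0.29),
    ("picture completion",  +0.6, 0.42, -0.2, 0.39),
    ("picture arrangement", +1.1, 0.35, +0.2, 0.32),
    ("object assembly",     -0.2, 0.57, +0.8, 0.54),
    ("block designs",       -0.1, 0.37, +0.4, 0.33),
    ("substitution",        +0.5, 0.30, -0.7, 0.27),
]

def critical_ratio(difference, standard_error):
    """Size of a difference between means in units of its standard error."""
    return abs(difference) / standard_error

def reliable(rows, diff_col, se_col):
    """Sub-tests whose critical ratio reaches the criterion."""
    return {row[0] for row in rows
            if critical_ratio(row[diff_col], row[se_col]) >= CRITERION}

vs_schizophrenics = reliable(TABLE_3, 1, 2)
vs_controls = reliable(TABLE_3, 3, 4)

print("morons vs. schizophrenics:", sorted(vs_schizophrenics))
print("morons vs. controls:      ", sorted(vs_controls))
```

The sketch recovers the same sub-tests named in the text: comprehension, picture arrangement, and arithmetic for the comparison with the schizophrenics, and arithmetic and vocabulary for the comparison with the controls.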

It is clear from Tables 2 and 3 that, although both schizophrenics and 
morons are significantly below the controls in arithmetic reasoning scores, 
the morons are significantly inferior to the schizophrenics. Perhaps we 
have here a useful “sign” of mental deficiency as distinguished from 
schizophrenia. The arithmetic test bears the most apparent relation to 
formal schooling, thus putting the moron back into the unpleasant situa- 
tion in which his inability to use symbols and abstractions in reasoning 
to a logical and acceptable conclusion was daily made apparent to him. 

It is likewise clear from Tables 2 and 3 that both schizophrenics and 
morons are inferior to controls in tests of picture arrangement and com- 
prehension, the schizophrenics significantly, the morons not significantly. 
The schizophrenics, however, are significantly inferior to the morons in 
these two tests. Perhaps we have here a useful “sign” of schizophrenia 
as distinguished from mental deficiency—a sign which may reflect the 
schizophrenic’s characteristic difficulty with handling causal relations, 
his ineptitude in dealing with practical social situations, his inability to 
synthesize discrete units, or some other more fundamental cause. 

If there are thus some suggestive cues which may prove useful in 
detecting mental disturbance from performance on the Wechsler-Bellevue 
scale, there are other cues mentioned repeatedly in the literature which,
for these patients and these tests, are not verified. Take, for example, 
the question of variability of performance as characteristic of mental 
disturbance. Wechsler (21, pp. 149-150) suggests that large inter-test 
discrepancies in scores are characteristic of schizophrenics, whereas
feeble-minded subjects are limited in their inter-test variability. Moreover, a
long and respected line of research, beginning with the work of Pressey 
and Cole, using the Yerkes point scale (17), and including such recent 
studies as those by Harris and Shakow (5) and Kendig and Richmond (8), 
suggest that variability of performance between sub-tests is indicative 
of the presence of mental disturbance. That such variability may accom- 
pany any deviation from average performance, however, is suggested by 
the work of Merrill (13), who found retarded and superior children to 
scatter more than normal children on the Stanford-Binet scale. Partly 
because variability may suggest deviations other than those characteristic 
of mental disturbance, considerable doubt has recently been cast on the 
clinical use of numerical measures of scatter. 

The present study indicates a wide difference in over-all variability 
between the two abnormal groups and the control subjects, as suggested 
earlier. There is even some evidence that the moron group may be dis- 
tinguished from the schizophrenic group on this basis, although the 
difference in variability between the two groups does not meet our 
criterion of statistical significance. If the arithmetic sum of deviation 
scores is calculated for each patient individually in each group (again 
making use of the data which the clinician would be able to observe), the 
mean variability thus defined is 21.5 for the schizophrenics and 18.9 for 
the morons; the critical ratio of this difference is 2.4. Some idea of the 
practical difficulty which would confront the clinician who attempted 
to make use of this cue in diagnosis, however, is gained from examining 
the overlap between the distributions of variability scores for the two 
groups. The discrepancy among deviation scores for one moron subject 
is greater than that of any schizophrenic, and, conversely, three schizo- 
phrenics show discrepancies among deviation scores which are less than 
those of any moron. 
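
The kind of overlap described above can be made concrete with a small Python sketch. All of the scores are hypothetical (the individual variability scores are not reported), and the function names are illustrative. A patient's variability is taken as the sum of his deviation scores without regard to sign, which is an assumed reading of "arithmetic sum."

```python
def variability(deviation_scores):
    """Sum of one patient's deviation scores, taken without regard to sign."""
    return sum(abs(d) for d in deviation_scores)

def outside_range(group, other):
    """Scores in `group` lying outside the range spanned by `other`."""
    low, high = min(other), max(other)
    return [score for score in group if score < low or score > high]

# Hypothetical variability scores, invented to mirror the pattern in the text.
moron_scores = [12.0, 15.5, 18.0, 19.5, 21.0, 30.0]
schizophrenic_scores = [9.0, 10.5, 11.0, 17.0, 22.5, 24.0, 28.0]

morons_beyond = outside_range(moron_scores, schizophrenic_scores)
schizophrenics_beyond = outside_range(schizophrenic_scores, moron_scores)

print("morons outside the schizophrenic range:", morons_beyond)
print("schizophrenics outside the moron range:", schizophrenics_beyond)
```

In this invented example only one moron and three schizophrenics fall outside the other group's range; every other patient has a variability score which could belong to either group, and that is the practical difficulty facing the clinician.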

Similar doubt is cast on the use of three other cues. Because “old”
associations are considered less likely to be affected by disease processes 
than “new” associations, the discrepancy between scores on vocabulary 
tests and scores on other tests more dependent upon newly-acquired 
habits has often been used as an index of mental deterioration and as a 
means of distinguishing psychotics from mental defectives. The argu- 
ment here is that performance on vocabulary items is indicative of a 
patient’s pre-psychotic intellectual level, and that his performance on 
tests requiring new learning or emphasizing speed is indicative of his 
present intellectual efficiency. Wechsler (21, p. 149) notes among the 
diagnostic signs of schizophrenia a high vocabulary score. Systematic 
use of a discrepancy score in the identification of mental deterioration 
has been made by Babcock (1), who constructed a battery of tests to
yield an index of intellectual efficiency, and by Schwarz (20) and Witt- 
mann (23), following Babcock’s initial lead. Representative studies 
suggesting that a relatively high vocabulary score is characteristic of psy- 
chotics as distinguished from mental defectives include those of Pressey 
(16), Piotrowski (15), Jastak (6), Malamud and Palmer (12), and David- 
son (4). 

In apparent contradiction to these findings, however, for the patients 
in the present study, using the Wechsler-Bellevue scale, the difference in 
mean deviation score between schizophrenics and morons in the vocabu- 
lary test is zero. Both groups obtain a mean deviation score on this 
test of +2.0, and for both groups this is the highest score obtained on 
the scale. It seems indisputable that, as compared to their performance 
on the scale as a whole, the performance of both schizophrenics and adult 
morons on the vocabulary test is superior. It will be recalled that both 
these groups were superior to the control group in deviation score on the 
vocabulary test, and that the differences between morons and controls 
and between schizophrenics and controls were statistically significant. 
Moreover, it will be noted from Table 2 that even the controls aged 30 
to 39 are relatively high on the vocabulary test, obtaining a mean devia- 
tion score of +0.5, which is one of the two highest scores achieved by 
these controls on this scale. 

If previous studies are accepted as revealing a correlation between 
relative performance in vocabulary and mental deterioration, then the 
results of the present investigation suggest that morons and schizo- 
phrenics between the ages of 30 and 39 may be characterized by lowered 
mental efficiency. Lacking regular tests of intelligence over an extended 
period of time, it is impossible to establish this point definitely. Some 
evidence that morons at the age of thirty have decreased as much in 
intellectual functioning as those of any older age group is given by the 
study of adult morons referred to earlier (24). Similar evidence of 
lowered mental efficiency in schizophrenia is given by the investigations 
already mentioned, although whether the term “deterioration” should
be applied to such decreased efficiency is questionable, in view of the 
improvements in mental functioning which have been observed in cases 
of remission. 

Whatever the explanation of the lack of difference between schizo- 
phrenics and mental defectives in vocabulary deviation score may prove 
to be, the important point to be made here is that, as a diagnostic cue 
for distinguishing mental deficiency from schizophrenia, relative per- 
formance on the test of vocabulary on the Wechsler-Bellevue scale is 
open to some suspicion. 

One other cue often mentioned as helpful in distinguishing mental 
disturbance from uncomplicated mental deficiency is performance on the
block designs test. Wechsler (21, p. 154) cites this test as one of the two 
most critical tests in differentiating between simple schizophrenia and 
mental deficiency. The “schizophrenic index” proposed by Rabin (19) 
for detecting schizophrenic trends depends, among other factors, upon a 
high score in the block designs test of the Wechsler-Bellevue scale. 
Bolles and Goldstein (2), using the complete form of this test as originally 
devised by Kohs (9), find that schizophrenics have particular difficulty 
with the designs. A recent study of the Kohs test as applied to fifteen 
schizophrenic subjects (10), however, reports no significant difference 
between Kohs and vocabulary mental age. 

Once again, the mental defectives and schizophrenics studied in the 
present investigation fail to obtain significantly different deviation scores, 
as shown in Table 3. Both groups, moreover, score farther above their 
own mean level of performance in the block designs test than does the 
control group of non-psychotics. Again caution in using this
often-suggested cue for the diagnosis of mental disturbance is indicated.

The discrepancy between scores for an individual patient on a verbal
and on a performance test is a measure widely used for differential 
diagnosis and one to which these data are appropriate. The clinician, 
having administered the Wechsler-Bellevue scale, has, in addition to a 
full-scale intelligence quotient, both a verbal and a performance quotient. 
Wechsler (21, p. 145) says that “such discrepancies are frequently
associated with certain types of mental pathology. . . .” He also states:
“In so far as the diagnostic significance of large differences between 
verbal and performance ability as a whole is concerned, the general 
finding is that in most mental disorders impairment of functioning is 
greater in the performance than in the verbal sphere. This holds for 
psychoses of every type, organic brain disease, and to a lesser though still 
large degree, in most psychoneuroses. On the other side of the fence 
there are only two groups. One is the adolescent psychopath (without 
psychosis) and the other the high grade mental defective. Both of these 
do better on performance than on the verbal tests” (21, p. 146). Rabin
(18) finds a mean superiority of verbal over performance quotients of 
5.89 for his group of 76 schizophrenics aged 16 to 49 years. The critical 
ratio of this difference is 4.16 and it contributes to his “schizophrenic 
index.”

For each subject in the present investigation, his performance quo- 
tient was subtracted from his verbal quotient. Here again, we deal with 
a measure which is available to the clinician after a test has been
administered. For the differences so obtained, the algebraic mean with
its standard error was computed for each group. In the schizophrenic 
group the mean discrepancy shows a superiority of the verbal over the 
performance quotient of 0.16 IQ points. The critical ratio of these
differences is 0.13. For the feeble-minded, the mean performance quo- 
tient exceeds the verbal by 1.18 IQ points, with a critical ratio of the 
difference of 0.58. Thus, although the differences are in the expected 
direction for both groups, they fall far short of our criterion of significance. 
Since, however, the two groups score in opposite directions on this 
measure, it is possible that they differ significantly from each other. 
This difference is 1.34, with a critical ratio of 0.56, again failing to 
approach our criterion of significance. 
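
The arithmetic behind these three critical ratios can be retraced in a short Python sketch. It makes two assumptions which are not stated in the text: that each critical ratio is the mean discrepancy divided by its standard error, and that the standard error of the difference between the two groups is the square root of the sum of the squared standard errors (independent groups). The standard errors themselves are not reported; they are backed out of the rounded figures and are therefore only approximate. The names are illustrative.

```python
from math import sqrt

# Reported mean verbal-minus-performance discrepancies (IQ points) and their
# critical ratios. The moron mean is negative because performance exceeds verbal.
schizophrenic_mean, schizophrenic_cr = +0.16, 0.13
moron_mean, moron_cr = -1.18, 0.58

def implied_standard_error(mean_discrepancy, critical_ratio):
    """Standard error implied by a reported mean and its critical ratio."""
    return abs(mean_discrepancy) / critical_ratio

se_schizophrenic = implied_standard_error(schizophrenic_mean, schizophrenic_cr)
se_moron = implied_standard_error(moron_mean, moron_cr)

# Difference between the two group means and its approximate critical ratio,
# assuming the two groups are independent.
between_groups = schizophrenic_mean - moron_mean
se_between = sqrt(se_schizophrenic ** 2 + se_moron ** 2)
cr_between = between_groups / se_between

print(f"difference between groups: {between_groups:.2f}")
print(f"approximate critical ratio: {cr_between:.2f}")
```

Under these assumptions the sketch returns a difference of 1.34 IQ points and a critical ratio of about 0.56, in agreement with the reported figures and well short of the criterion of 3.0.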

That it is unwise to explore such factors in clinical groups selected 
with no regard to variations in age within the groups is again made 
apparent. Age would seem to be the most important factor accounting 
for the differing results of the present study of patients aged 30 to 39 
years and Rabin’s group aged 16 to 49 years. For the feeble-minded 
group, at least, there is some corroborative evidence on this point. 
Johnson and Fernow (7) find young feeble-minded subjects to score 
higher on performance tests, whereas those who are older succeed
better with verbal tests. It is of some interest that their group of 47 
patients aged 32 to 39 (the division corresponding most closely to the 
subjects of the present investigation) shows a smaller discrepancy be- 
tween the two scores than does any other age group from 8 to 79 years. 


Summary and Conclusions 


In an attempt to evaluate certain widely-accepted cues for the diag- 
nosis of mental disturbance from performance on a standardized intelli- 
gence scale, the Wechsler-Bellevue scale was administered to 40 morons 
and 80 schizophrenics, aged 30-39, and scores were compared with those 
of 210 controls of the same age range. Comparisons were made in terms 
of deviation scores expressing the performance of a patient on the indi- 
vidual tests relative to his own general level of performance—a measure 
which makes use of just the sort of information concerning a patient 
available to the clinician after he has administered a test. As a result 
of this analysis, three diagnostic “signs” may be tentatively suggested:

1. Performance on arithmetic reasoning: a relatively low performance 
is characteristic of adult morons as distinguished from schizophrenics and 
controls. 

2. Performance on picture arrangement tests: a relatively poor showing 
is characteristic of adult schizophrenics as distinguished from morons 
and controls. 

3. Performance on comprehension tests: a relatively poor showing is 
characteristic of adult schizophrenics as distinguished from morons and 
controls. 

On the other hand, doubt is cast upon four other “signs” which are
frequently found in the literature: 



















1. Score on vocabulary items: a relatively high score is not char- 
acteristic of adult schizophrenics as distinguished from adult morons, for 
both groups score relatively higher than controls on this test. 

2. Score on block designs test: a relatively poor score is not char- 
acteristic of adult schizophrenics as distinguished from morons or controls, 
and both morons and schizophrenics are slightly higher than the controls 
in deviation scores on this test. 

3. General variability: greater variability (inconsistency) of score 
from test to test is not characteristic of schizophrenics as distinguished 
from morons, although both schizophrenics and morons are more variable 
than controls. 

4. Scores on performance and verbal tests: superiority of verbal over 
performance scores does not characterize the schizophrenic group, nor is 
superiority of performance over verbal scores distinctive of the feeble- 
minded. 

These results constitute only a small part of the information which 
is required if systematic use is to be made of intelligence test results as 
aids in the diagnosis of mental disturbance. Since the problem of the 
decline of abilities with age complicates all such psychometric studies 
using adult subjects, similar investigations on patients of younger and 
older ages are desirable. Since the real issue is one of distinguishing 
patients suffering from different disorders, particular care should be 
exercised in the selection of patients whose diagnosis is clear and un- 
complicated. If these investigations are to be of practical value to the 
clinical psychologist, they should make use of a measure which is related 
directly to the observations he makes on his patients in the testing 
situation; the deviation scores used in this and the earlier study with 
schizophrenics would seem logical for this purpose. 

Above all, the failure of the present investigation to verify, for this 
age range and this intelligence scale, certain cues often accepted without 
question as indicative of mental disturbance points to the necessity for 
extreme caution in labelling patients on the basis of test results. There 
is considerable evidence suggesting that mental disturbance affects indi- 
vidual patients in widely different ways, whether its effect be shown in 
informal behavior or in the formal test situation. Whatever may be the 
ultimate outcome of attempts to define general cues for the detection of 
mental disturbance from test performance, it is clear that at present the 
unqualified use of many common cues is unjustified. 


Received June 29, 1943.


Bibliography


1. Babcock, H. An experiment in the measurement of mental deterioration. Arch.
Psychol., N. Y., 1930, 18, No. 117.

2. Bolles, M., and Goldstein, K. A study of the impairment of “abstract behavior”
in schizophrenic patients. Psychiat. Quart., 1938, 12, 42-65.

3. Bradway, K. P. Academic achievement in a group of mentally retarded subjects.
Proc. Amer. Ass. Stud. ment. Def., 1939, 44, 154-162.

4. Davidson, M. A study of schizophrenic performance on the Stanford-Binet scale.
Brit. J. med. Psychol., 1937, 17, 93-97.

5. Harris, A. J., and Shakow, D. The clinical significance of numerical measures of
scatter on the Stanford-Binet. Psychol. Bull., 1937, 34, 134-150.

6. Jastak, J. Psychometric patterns of state hospital patients. Delaware St. med. J.,
1937, 9, 87-91.

7. Johnson, A. P., and Fernow, D. L. Comparison of results of Stanford-Binet and
performance tests given at the Dixon State Hospital. Proc. Amer. Ass. Stud.
ment. Def., 1939, 44, 103-109.

8. Kendig, I., and Richmond, W. V. Psychological studies in dementia praecox. Ann
Arbor: Edwards, 1940.

9. Kohs, S. C. Intelligence measurement: A psychological and statistical study based
upon the block-design tests. New York: Macmillan, 1923.

10. Lidz, T., Gay, J. R., and Tietze, C. Intelligence in cerebral deficit states and
schizophrenia measured by Kohs block test. Arch. Neurol. Psychiat., Chicago,
1942, 48, 568-582.

11. Magaret, A. Parallels in the behavior of schizophrenics, paretics, and pre-senile
non-psychotics. J. abnorm. soc. Psychol., 1942, 37, 511-528.

12. Malamud, W., and Palmer, E. M. Intellectual deterioration in the psychoses.
Arch. Neurol. Psychiat., Chicago, 1938, 39, 68-81.

13. Merrill, M. A. On the relation of intelligence to achievement in the case of mentally
retarded children. Comp. Psychol. Monogr., 1942, 2, No. 10.

14. Miles, W. R. Psychological aspects of aging. In Cowdry, E. V., Problems of aging.
Baltimore: Williams & Wilkins, 1942. Pp. 756-785.

15. Piotrowski, Z. A. A comparison of congenitally defective children with schizo-
phrenic children in regard to personality structure and intelligence type. Proc.
Amer. Ass. Stud. ment. Def., 1937, 42, 78-90.

Pressey, 8. Distinctive features in psychological test measurement made in de- 
mentia praecox and chronic alcoholic patients. J. abnorm. soc. Psychol., 1917, 
12, 130-139. 

Pressey, 8., and Cole, L. Irregularity in a psychological examination as a measure 
of mental deterioration. J. abnorm. soc. Psychol., 1918, 13, 285-294. 

Rabin, A. I. ; Differentiating psychometric patterns in schizophrenia and manic- 
depressive psychosis. J. abnorm. soc. Psychol., 1942, 37, 270-272. 

Rabin, A. I. Test-score patterns in schizophrenic and non-psychotic states. J. 
Psychol., 1941, 12, 91-100. 


. Schwarz, R. Measurement of mental deterioration in dementia praecox. Amer. 


J. Psychiat., 1932, 12, 555-560. 
Wechsler, D. The measurement of adult intelligence. Baltimore: Williams & Wil- 
kins, 1941. 


. Wechsler, D., Israel, H., and Balinsky, B. A study of the sub-tests of the Bellevue 


intelligence scale in borderline and mental defective cases. Amer. J. ment. Def., 
1941, 45, 555-558. 


. Wittmann, P. The Babcock deterioration test in state hospital practice. J. 


abnorm. soc. Psychol., 1933, 28, 70-83. 


. Wright, C. The nature of the decline of performance abilities in adult morons as com- 


pared with that of normal adults. Unpublished Doctor’s Dissertation, Stanford 
Univ., 1943. 





A Test Battery for Identifying Potentially Successful 
Naval Electrical Trainees 


C. H. Lawshe, Jr., and G. R. Thornton 
Division of Education and Applied Psychology, Purdue University 


The primary purpose of the research ¹ reported in this paper was the
development of a battery of tests which might be useful in identifying 
those individuals who are most apt to be successful in a Navy Training 
School for electricians. Secondarily, results of the research have already 
served the following purposes: 

1. To provide information to supplement the personal interview in 
the selection of section leaders. 

2. To identify those trainees who are in need of individual help includ- 
ing coaching and personal counseling. 

3. To supply members of the instructional staff with information 
concerning the present educational and experience levels of the several 
trainees. 

The Training Program. The Purdue Naval Training School, one of 
several of its kind, is operated for the purpose of training Electrician’s 
Mates, Third Class. Enrollees come from basic Naval training stations 
where they have had from four to ten weeks of “boot” training but no 
specific electrical training other than that which they may have had at the 
time of induction into service. Certain psychological tests are adminis- 
tered at the basic station; test scores together with personal data and 
interview information are considered in allocating trainees to this and 
other schools. 

The Purdue Naval Training School receives a company of approxi- 
mately 200 trainees once each month. Each trainee spends fifteen weeks 
in the school, at the close of which time he is “rated’’ as Electrician’s Mate, 
Third Class, provided he successfully completes the program. Some are 
eliminated from the program after the fourth week for scholastic reasons, 
and some complete the fifteen weeks but do not qualify for a rating. The


1 The investigation was made possible through the cooperation of Professor C. W. 
Beese, Director of the Purdue War Training Program; Dr. D. A. Scott, Supervisor of 
the Purdue Naval Training School; and Lieutenant Commander H. M. Hart, Com- 
manding Officer of the Purdue Naval Training School at the time the research was 
started. The present study was a local one and its publication here in no way implies 
Navy endorsement or policy. The authors are also indebted to Mr. Charles Turner and 
Mr. T. D. Peterson for much of the statistical work. 




curriculum includes the following five courses: Electrical Laboratory, 
Wiring Laboratory, Tool Laboratory, Shop Mathematics, and Electrical 
Theory. The various courses are taught by different instructors, and the 
program follows a carefully outlined procedure. 





Experimental Procedure and Results 


The Original Test Battery. As stated above, the allocation of trainees 
to the school is based, in part, upon their scores on certain tests ² which
are administered at the basic Naval training station. These regularly 
used tests are: 





Test A—a commercially available mechanical aptitude test. 
Test B—a commercially available general classification test. 
Test C—for measuring mechanics in English. 

Test D—for measuring simple skills in arithmetic. 

Test E—for measuring proficiency in spelling. 


Following an analysis of the training program these instruments were 
supplemented with the following three additional tests, administered after 
the trainees arrived at Purdue: 


Test F—a 15 minute mental alertness test. 

Test G—a test designed to evaluate an individual’s ability to read 
simple measurements and solve simple arithmetical problems. 

Test H—a test designed to measure practical electrical information. 


While the first two are commercially available tests, the test of practical 
electrical information was designed specifically for use in this program. 
In its experimental form the test consisted of 125 true and false items 
dealing with practical electrical information. Companies 3, 4, and 5 were 
administered the experimental form, after which papers belonging to the 
100 men with the highest grades in the school and to the 100 with the 
lowest grades were segregated. The “Kelley technique” of item analysis
as described by Lawshe ³ was applied and D-values were found. Fifty-five
items, having a minimum D-value of .4 and a mean D-value of .6, were 
retained in the final form which was administered to subsequent groups. 
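
The selection step can be illustrated with a simplified sketch. It does not reproduce the D-value read from Lawshe's nomograph. It assumes instead a plain upper-lower index, the difference between the proportions of the high and low groups answering an item correctly. The function names and the item counts are hypothetical.

```python
def discrimination_index(high_correct, low_correct, group_size=100):
    """Difference between the proportions of the high and low groups
    answering an item correctly (a stand-in for the nomograph D-value)."""
    return (high_correct - low_correct) / group_size


def select_items(item_counts, minimum=0.4):
    """Return the ids of items whose index meets the minimum, in original order."""
    return [
        item_id
        for item_id, (high, low) in item_counts.items()
        if discrimination_index(high, low) >= minimum
    ]


# Hypothetical counts of correct answers: item id -> (high group, low group),
# each group consisting of 100 papers as in the procedure described above.
item_counts = {1: (92, 35), 2: (60, 55), 3: (81, 40), 4: (70, 31)}
print(select_items(item_counts))
```

With these invented counts, items 1 and 3 meet the minimum and items 2 and 4 are dropped.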

Test F was administered to 197 men in Company 5; all other tests 
(including the first five) were administered to 587 men in Companies 3, 
4, and 5. All tests were administered prior to the beginning of training.

The Criterion. The grade point average (GPA) earned in the school


2 The names of all tests reported in this paper are withheld in compliance with the
request of the United States Navy. The names of the commercially available tests will
be supplied to professionally trained persons upon request to the authors.

3 C. H. Lawshe, Jr., A nomograph for estimating the validity of test items. J. appl.
Psychol., 1942, 26, 846-849.













was used as the criterion of proficiency. This consisted of the mean of 
the weekly grades earned in each of the five courses already mentioned. 
Since the program continued for 15 weeks, each trainee received 75 grades 
during his training. Fifteen of these were issued by each of five different 
instructors. In a few instances where the individual dropped out before 
the completion of training, it was necessary to compute the GPA with 
fewer than 75 grades. Weekly grades were based largely but not entirely 
upon objective tests which were undergoing revision and standardization 
at the time the study was being carried on. The grading system was the 
one universally used by the Navy. Grade point averages ranged from 
1.9 to 3.9, with a mean of 3.1. (The highest possible grade point average 
was 4.0, and 2.5 was designated as passing.) The criterion was an
exceptionally stable one. In Company 5, average grades received on “odd”
weeks were correlated with average grades received on “even” weeks and
a coefficient of .968 was obtained. This value when “stepped up” by 
the Spearman-Brown prophecy formula yields .984 as an estimate of the 
reliability of the GPA. 
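
The step-up just described can be verified with a short calculation. The sketch below applies the Spearman-Brown prophecy formula for a measure doubled in length; the function name is an illustrative choice.

```python
def spearman_brown(r_half, factor=2.0):
    """Estimated reliability of a measure lengthened by `factor`,
    given the correlation between comparable parts of the original length."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


# Odd-even correlation reported for Company 5.
print(round(spearman_brown(0.968), 3))  # 0.984
```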

Intercorrelations. Criterion data together with test scores, company


Table 1
Intercorrelations Between Eight Tests and Criterion

                            Criterion (GPA) and Tests
Tests     N     GPA     E      D      C      B      A      H      F
Test G   197   .712   .245   .428   .418   .361   .485   .393   .595
         587   .660   .196   .408   .317   .275   .366   .364    —
Test F   197   .662   .346   .310   .391   .576   .360   .371
Test H   197   .579   .101   .123   .154   .394   .630
         587   .537   .095   .143   .175   .286   .585
Test A   197   .524   .012   .114   .170   .405
         587   .441   .081   .133   .185   .315
Test B   197   .521   .320   .268   .417
         587   .401   .273   .224   .398
Test C   197   .449   .411   .319
         587   .353   .424   .320
Test D   197   .411   .336
         587   .356   .339
Test E   197   .260
         587   .208

Note: For each test the first row (boldface in the original) is based upon an N of 197; the second row upon an N of 587.










number, and other information were punched on I. B. M. cards. All test 
scores were correlated with the criterion and with each other. These 
correlation coefficients are presented in Table 1. Because seven test 
scores were available on 587 subjects and eight test scores were available 
on 197 subjects, two sets of values were computed, those for the 197 
subjects being shown in boldface in the table. 

Selecting the Optimum Battery. Because all eight test scores were 
available only for Company 5 (N = 197), the data from that group were 
treated separately. The five tests which correlated highest with the 
GPA criterion, namely Test F, Test G, Test H, Test B, and Test A, were 
treated by the Wherry-Doolittle technique in order to find the maximum 
shrunken multiple correlation with the criterion. As is indicated in 
Table 2 each of the five tests contributes something. However, since the 


Table 2
Shrunken Multiple Correlations Obtained by Wherry-Doolittle Technique

                           Shrunken Multiple R
Tests                   Company 5    Companies 3, 4, 5
Test G                    .712             .660
Tests G, H                .782             .732
Tests G, H, F             .817             .750
Tests G, H, F, B          .822             .754
Tests G, H, F, B, A       .828             .756

N                          197              587





last two raise the shrunken R only from .817 to .828, and because of the 
additional labor involved in adding more variables, it was felt that satis- 
factory results could be achieved by using the first three tests. The three 
tests retained have a total administration time of 65 minutes. 
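
The second row of Table 2 can be checked approximately from the correlations in Table 1. The sketch below is an illustration only and is not the Wherry-Doolittle procedure itself. It uses the standard two-predictor formula for the multiple R, and it assumes that the shrinkage correction has the form 1 − (1 − R²)(N − 1)/(N − m), where m is the number of predictors. The function names are hypothetical.

```python
import math


def multiple_r_two(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors."""
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)


def shrunken_r(r, n, m):
    """Shrunken multiple R under the assumed correction
    1 - (1 - R^2)(N - 1)/(N - m)."""
    adjusted = 1 - (1 - r**2) * (n - 1) / (n - m)
    return math.sqrt(max(adjusted, 0.0))


# Tests G and H: correlations with GPA and with each other, from Table 1.
company_5 = shrunken_r(multiple_r_two(0.712, 0.579, 0.393), n=197, m=2)
all_companies = shrunken_r(multiple_r_two(0.660, 0.537, 0.364), n=587, m=2)
print(round(company_5, 2), round(all_companies, 2))
```

Because the tabled coefficients are rounded to three places, the results agree with the reported .782 and .732 only approximately.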

Beta weights were computed, and the regression equation for predict- 
ing the GPA from the three tests was found to be: 


$$\mathrm{GPA}_{\mathrm{pred.}} = .027(X_G) + .014(X_H) + .024(X_F) + 1.81$$


Similar computations were made utilizing the data from all three 
companies (N = 587). However, since scores on Test F were not
available for the whole group but only for 197 subjects, r’s obtained with 
the smaller population were added to the matrix. The Wherry-Doolittle 
technique was again applied, and the shrunken multiple R values obtained 
are presented in Table 2. It will be noted that the rank order in which 
the several measures are added is identical for both groups but that all 
values for the larger group run consistently lower. Again utilizing the 










best three tests, beta weights were computed, and the regression equation 
was found to be: 


$$\mathrm{GPA}_{\mathrm{pred.}} = .024(X_G) + .012(X_H) + .024(X_F) + 1.93$$


One possible explanation as to why values on the larger group are 
smaller than are those obtained with the single company can be advanced. 
It has already been stated that objective tests which figured materially in 
the assignment of grades in the training program were in the process of 
standardization while the study was in progress. This is evidence to 
support the belief that the criterion with Company 5 is more stable than 
with Companies 3 and 4. It was decided, however, to test the two equa- 
tions empirically in order that the best possible prediction might be 
achieved. 
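
To make the use of the two regression equations concrete, the following sketch applies both to a single trainee. The weights and constants are those given above. The raw scores are hypothetical, since the score scales of the three tests are not reported here, and the names are illustrative.

```python
# Weights for Tests G, H, and F and the constant term, as reported above.
COMPANY_5 = {"weights": (0.027, 0.014, 0.024), "constant": 1.81}
ALL_COMPANIES = {"weights": (0.024, 0.012, 0.024), "constant": 1.93}


def predicted_gpa(x_g, x_h, x_f, weights, constant):
    """Predicted grade point average from raw scores on Tests G, H, and F."""
    w_g, w_h, w_f = weights
    return w_g * x_g + w_h * x_h + w_f * x_f + constant


# Hypothetical raw scores for one trainee.
scores = (30, 20, 15)
print(round(predicted_gpa(*scores, **COMPANY_5), 2))
print(round(predicted_gpa(*scores, **ALL_COMPANIES), 2))
```

For these invented scores the two equations differ by only .01 grade point, which is in keeping with the close agreement later found between them.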


Validation of Battery with New Group 


Procedure. In order to evaluate the two regression equations, data 
from a new company were employed. Company 6, like previous com- 
panies, was tested prior to training with the three tests. Predicted 
GPA’s for each of 200 trainees were computed by means of both equations. 
Each set of predictions in turn was correlated with actual GPA’s earned 
during the fifteen weeks of training. Correlation coefficients were found 
to be .817 when the equation derived from the data on 587 cases was 
employed and .819 when the equation derived from the data on 197 cases 
was employed. Since there was no significant difference between the two,
a choice was made in favor of the latter because it was felt that the
criterion with this group was more nearly representative of what could be
expected with future companies.⁴

Predictive Value of Battery. In order to examine further the predictive 
value of the battery, differences between actual GPA’s and predicted 
GPA’s for Company 6 were tabulated and have been presented graphi- 
cally in Figure 1. Here it is shown that no difference existed in 23 per 
cent of the cases, a difference of .1 grade point (about one fifteenth of the 
range) existed in 27 per cent of the cases, and so on. Or to express the 
variations in another way, 50 per cent of the predictions were within an 
error of .15 grade point, 70 per cent were within .25 grade point, and 88


4 The correlation between actual GPA’s and predicted GPA’s for Company 7 was
later found to be .640. The lower coefficient is accounted for by the fact that the 
members of Company 7 were more homogeneous as to grades received. The standard 
deviation of GPA’s was .42 for Company 6 but only .32 for Company 7. Differences 
between actual GPA’s and predicted GPA’s for Company 7 were found to be distributed 
in virtually the same manner as those shown in Figure 1 for Company 6, and the standard 
error of estimate for Company 7 was found to be .246 as compared with .240 for Com- 
pany 6. 







per cent were within .35 grade point. These latter figures become more 
meaningful when compared with similar figures which would obtain 
provided the mean GPA were predicted in every case. Here 12.5 per cent 
of the predictions would be within an error of .05 grade point, 31 per cent 
within .15 grade point, 45.5 per cent within .25 grade point, and 61.5 
per cent within .35 grade point. 
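
The comparison with a constant prediction of the mean can be sketched as follows. The grade point averages are hypothetical and serve only to show how the two sets of percentages are obtained; the function name is illustrative.

```python
def percent_within(actual, predicted, tolerance):
    """Percentage of cases whose prediction lies within `tolerance` of the
    actual value (a small margin guards against floating-point error)."""
    hits = sum(
        1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance + 1e-9
    )
    return 100.0 * hits / len(actual)


# Hypothetical actual and predicted grade point averages for five trainees.
actual = [2.4, 2.9, 3.1, 3.3, 3.8]
predicted = [2.6, 2.9, 3.0, 3.4, 3.5]

# Baseline: predict the group mean for every trainee.
mean_gpa = sum(actual) / len(actual)
baseline = [mean_gpa] * len(actual)

print(percent_within(actual, predicted, 0.15))  # 60.0
print(percent_within(actual, baseline, 0.15))   # 20.0
```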

The standard error of estimate is sometimes a better measure of pre- 
dictive value than the correlation coefficient cited above. The standard 
error of estimate ⁵ was found to be .240, which is to say that in 67 per cent















































[Figure 1 bar chart. Difference between actual and predicted GPA, with cumulative percentage of cases: .0, 23%; .1, 50%; .2, 70%; .3, 88%; .4, 94%; .5, 96%; .6, 98%; .7, 98.5%; .8, 99.5%; .9, 100%.]





Fig. 1. Graph showing the percentages of differences between actual and predicted
grade point averages (solid bars) and the cumulative percentages indicating the
proportion of differences falling within various limits (total length of bars).


of the cases the predicted GPA will be within .24 grade point of the 
actual GPA. This value when compared to the standard deviation of 
actual GPA’s of .42, represents a reduction of 43 per cent in the errors of 
prediction. 


5 The standard error of estimate was first calculated by the formula,
$\sigma_{yx} = \sigma_y \sqrt{1 - r_{xy}^{2}}$. The correlation used in this formula gives a measure of the co-variation
of the two sets of scores but does not take into account errors of prediction that may be 
common to all cases. For example, if all the predicted scores were one unit less than 
the actual scores, correlating the two would not show such a fact. For this reason the 
standard error was also calculated from the obtained differences between predicted 
GPA’s and actual GPA’s (Company 6) and found to be .247 as compared with .240 
presented above. 
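
The figures in this section can be checked against one another with a brief calculation. The sketch below computes the standard error of estimate from the standard deviation and the correlation, the corresponding reduction in error, and the correlation implied for Company 7 by its reported standard deviation and standard error. It also shows, with invented values, why a standard error computed from the obtained differences registers a constant error that the correlation ignores. The function names are hypothetical.

```python
import math


def standard_error_of_estimate(sd_y, r):
    """Standard error of estimate from the criterion SD and the correlation."""
    return sd_y * math.sqrt(1.0 - r**2)


def implied_r(sd_y, see):
    """Correlation implied by a criterion SD and a standard error of estimate."""
    return math.sqrt(1.0 - (see / sd_y) ** 2)


def rms_error(actual, predicted):
    """Standard error computed directly from the obtained differences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


see_6 = standard_error_of_estimate(0.42, 0.819)   # Company 6
reduction = 1.0 - see_6 / 0.42                    # proportional reduction in error
r_7 = implied_r(0.32, 0.246)                      # Company 7
print(round(see_6, 2), round(reduction, 2), round(r_7, 2))

# Invented case: every prediction is one unit below the actual score.
# The two sets correlate perfectly, yet the direct standard error is 1.0.
print(rms_error([3.0, 2.5, 3.5], [2.0, 1.5, 2.5]))  # 1.0
```

The computed values agree with the reported .240, 43 per cent, and .640 to within rounding of the published coefficients.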










Results to be Expected. The value which would be derived should the 
test battery and the regression equation be employed for the selection of 
future trainees is illustrated in Figure 2. In this graph, based on data 
from Company 6, the base line represents various predicted GPA’s that 
might have been set up as admission requirements, while the vertical axis 
represents the proportion who would have attained or exceeded specific 
actual GPA’s. For example, suppose that a minimum predicted GPA
of 3.1 had been required for admission; where 36 per cent of these trainees 


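[Figure 2 graph. Vertical axis: percentage of those admitted who will equal or exceed certain grade point averages. Horizontal axis: minimum predicted grade point average required, with positions apparently marked “without new tests,” 2.5, 2.8, 3.1, 3.4, and 3.7.]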


Fig. 2. Curves showing the percentage of trainees who would have received actual
grade point averages above certain levels had successively higher predicted grade point 
averages been required for admission. 


actually attained or exceeded 3.4, 49 per cent would then have attained 
or exceeded this level. Likewise, where 84 per cent actually attained or 
exceeded 2.8, 99 per cent or practically all would have attained or exceeded 
this level. Since 2.5 is defined as passing, these facts seem quite signifi- 
cant. Similar values may be read from the graph for other minimum 
predicted GPA values. 
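
The reading of such a graph amounts to a simple tabulation, sketched below. The records are hypothetical pairs of predicted and actual grade point averages, not data from Company 6, and the function name is illustrative.

```python
def percent_reaching(records, min_predicted, target_actual):
    """Among trainees whose predicted GPA meets the admission minimum,
    the percentage whose actual GPA equals or exceeds the target."""
    admitted = [actual for predicted, actual in records if predicted >= min_predicted]
    if not admitted:
        return None
    reached = sum(1 for actual in admitted if actual >= target_actual)
    return 100.0 * reached / len(admitted)


# Hypothetical (predicted GPA, actual GPA) pairs.
records = [(2.6, 2.4), (2.9, 3.0), (3.1, 3.2), (3.2, 3.5), (3.4, 3.3), (3.6, 3.8)]

# No admission requirement: every trainee is counted.
print(round(percent_reaching(records, 0.0, 3.4), 1))
# Minimum predicted GPA of 3.1 required for admission.
print(percent_reaching(records, 3.1, 3.4))
```

In this invented group one third of all trainees reach 3.4, while one half of those with a predicted GPA of 3.1 or better do so.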


Summary and Conclusions 


Approximately 600 Naval electrical trainees were tested prior to the 
start of fifteen weeks of training. Test scores were correlated with GPA’s 
















(grade point averages) earned during the training period, and the Wherry- 
Doolittle technique was applied in order to select the best combination 
of tests. The three tests with the greatest predictive significance, in their 
order of importance, were found to be: Test G (a test designed to evaluate 
an individual’s ability to read simple measurements and solve simple 
arithmetical problems), Test H (a test designed to measure practical 
electrical information), and Test F (a 15 minute mental alertness test). 
The total administration time is 65 minutes for these three tests. 

By means of a regression equation, predicted GPA’s for a new group 
of 200, also tested before training, were computed. The following results 
and conclusions are supported: 

1. Predicted GPA’s for the new group correlated .82 with actual 
GPA’s. 

2. The regression equation predicted GPA’s without error in 23 per 
cent of the cases, within an error of .15 grade point (one fifteenth of the 
range) in 50 per cent of the cases, within .25 grade point in 70 per cent of 
the cases, and within .35 grade point in 88 per cent of the cases. 

3. Application of the regression equation to a new company indicates 
that should a minimum predicted GPA of 3.0 or 3.1 be established as an 
admission requirement, virtually all failures could be eliminated. 


Received April 24, 1943. 





Studies in International Morse Code 
I. A New Method of Teaching Code Reception * 


Fred S. Keller 
Columbia University 


The method of instruction described herein was developed during the 
past two years in the Psychology Department of Columbia University, 
and represents the combined efforts of a number of students and teachers 
who were interested in contributing to the military and civilian training 
program in radio communication.¹ In its present form, the method
apparently possesses certain advantages over the conventional devices for
teaching beginners to recognize quickly and accurately the basic
sound-patterns or signals of International Morse Code. In addition, it shows
promise as a useful experimental instrument in the solution of
code-learning problems and, perhaps, of learning problems in general.

The distinctive aspect of this method will be readily appreciated by 
psychologists as a modification of the well-known procedures of “paired 
associates” and “regular reinforcement.” Essentially, it involves (1) the
presentation of a Morse-Code signal to the ear of the student, (2) a short 
pause in which the student is given the opportunity to write the character 
(letter or digit) which corresponds to the signal, and (3) an identification 
of the signal by the instructor. 

The mode of initiating and transmitting either the signals or the 
voiced identifications may vary considerably, depending upon available 
facilities; the number and variety of signals in a single practice session 
may be altered by the instructor as he wishes; and the announcement of 


* The method herein described has since been adopted, in slightly modified form, 
by the U. S. Army Signal Corps as one of two official code-training devices. See War
Department Technical Manual (TM 11-459), “Instruction For Learning International
Morse Characters,” June 2, 1943.

1 This work was initiated as a result of the encouragement and helpfulness of Dr.
M. R. Trabue, Chairman of the National Research Council’s Emergency Sub-Committee 
on Learning and Training. The following men were of especial assistance in suggesting 
or executing exploratory studies: Dr. Spaulding Rogers (Hofstra College); Mr. R. E. 
Taubman, a graduate student in psychology at Columbia; and Prof. R. P. Youtz 
(Barnard College, Columbia). Development of the method for group instruction was 
furthered by the kind services of Mr. H. P. Bechtoldt of the Personnel Procedures 
Section of the Adjutant General’s Office, Washington, D. C., and by the hearty coopera- 
tion of officers and men of the Signal Corps Replacement Training Center, Camp Charles 
Wood, N. J. 




characters may be simply in terms of “A,” “B,” “C,” etc., or of certain
“phonetic equivalents” of the characters (“Able,” “Baker,” “Charlie,”
etc.). Even the interval between signal and identification, or between 
one identification and the subsequent signal, may be altered. The funda- 
mental sequence of events—signal . . . pause . . . identification—alone 
marks the method as different from other training procedures, most of 
which merely present to the student a slow succession of signals until he
is able, through reference to some previously learned visual dot-dash or 
“di-dah” formulation, to respond adequately to a prescribed number of 
successive signals. 

Although modifications of this signal-and-voice method have been 
employed in a number of military and civilian training centers, the present 
account will deal exclusively with the procedure used in code classes at 
Columbia University during the present college year. These classes were 
composed of Columbia College undergraduates, and approximately one 
hundred and fifty men, ranging in age from seventeen to twenty-three, 
received instruction daily, five afternoons a week, in fifty-minute practice 


Table 1 
“Phonetic Equivalents” of the Alphabet 





Able Fox King Peter Uncle Zebra 
Baker George Love Queen Victor 

Charlie How Mike Roger William 

Dog Item Nan Sugar X-ray 

Easy Jig Oboe Tare Yoke 





sessions.² Due to limitations in space and equipment, no more than
thirty students received instruction at any one time. 

During the first class hour, after the students were informed of the 
general aims and requirements of the course, they were given a black- 
board demonstration of the mode of lettering to be used throughout their 
training period. The standard U. S. Army Signal Corps method of 
printing was illustrated, and particular stress was laid upon the correct 
formation of those letters and digits which are commonly a source of 
confusion in reading printed copy—e.g., 2 and Z, 1 and I, Zero (Ø) and O,
etc. 

Subsequently, the students were supplied with a mimeographed list 
of the words used by the Signal Corps in telephonic communication as the 
“phonetic equivalents” of the letters of the alphabet. Official modifica- 


2 Students in these classes received two points of college credit if they achieved a 
ten-word-per-minute level of code-reception during the Winter Session (about seventy 
hours of instruction). In Spring Session beginners’ classes, this requirement was raised 
to twelve words per minute. 







tions of this list have been made recently, and Table 1 gives the key-words 
which are now in use at Columbia. With respect to the digits, for which 
there are no equivalents, they were told of the orderly succession of “dits”
and “dahs” (the conventional representatives of signal dots and dashes)
which prevails as one passes from 1 to Zero. This was done merely for the
purpose of putting all students on an equal footing with respect to this
knowledge, thus minimizing any advantage in learning which might accrue
from the recognition by some of the students of the “logic” of the
arrangement. Practically no reference to dits and dahs, or dots and 
dashes, was made during their later instruction. The students were en- 
couraged, rather, to hear the code signals as unitary and not as a combina- 
tion of elements—a feat that is more difficult with the digit signals than 


with the alphabet signals, because of the long-drawn-out nature of some 
of the former. 
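
The orderly succession of dits and dahs among the digit signals may be set down compactly. The sketch below generates the International Morse signals for the ten digits from the rule that each has five elements; the function name is illustrative.

```python
def digit_signal(digit):
    """International Morse signal for a digit, as dots (dits) and dashes (dahs).

    Digits 1 to 5 begin with that many dits and are filled out with dahs;
    6 to 9 begin with one to four dahs and are filled out with dits;
    Zero is five dahs.
    """
    if not 0 <= digit <= 9:
        raise ValueError("digit must be between 0 and 9")
    if digit <= 5:
        return "." * digit + "-" * (5 - digit)
    return "-" * (digit - 5) + "." * (10 - digit)


# Print the signals in the order 1, 2, ..., 9, Zero.
for d in list(range(1, 10)) + [0]:
    print(d, digit_signal(d))
```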

















































































































































Fig. 1. Daily practice sheet. 


After this introduction to lettering and phonetic equivalents, the 
students were provided with mimeographed record-blanks or practice 
sheets, and instructed as to the manner in which these sheets were to be 
used later in “copying” code. A sample form is shown in Figure 1. 
Following a brief mention of the importance of printing name, date, and 
time on each form used, the students were given substantially the follow- 
ing instructions: 


“This is the type of record blank which you will use throughout your train- 
ing in code reception. It consists of double rows of small squares, broken down 
into blocks of ten. The manner of filling in these squares is very simple. In 
each practice session, you will hear a succession of one hundred code signals, 
presented in random order. To each of these signals you will try to respond by 
printing the appropriate character in one of the squares in the upper row within 
each block, working from left to right across the page. 

The signals will be presented one at a time, and you will be given about three 
seconds in which to respond to each; then the instructor will announce the letter 
or digit that you should have printed during the three-second period. This 










announcement will let you know whether you were right or wrong in responding 
to the signal as you did. 

Immediately after the instructor’s announcement, the next signal will be 
sounded, and you will attempt to identify it just as you did the one before, and 
again the instructor will name the character after a three-second pause. This 
will be followed by another signal, another pause, and another announcement; 
and the whole procedure will be repeated until you have heard one hundred 
signals, each of which you tried to recognize, and each of which was named 
correctly by your instructor. 

If all your responses during a practice session are correct, the four upper rows 
of squares on your record sheet will be filled, and the four lower rows will be 
empty. At first, however, you will probably make a great many errors. That 
is, you will fail to respond correctly to some of the signals, and you will not be 
able to respond at all to others within the allotted time. The procedure in case 
of either an erroneous or omitted response may be shown best by a concrete 
example. (See Figure 1.) 

Let us suppose that, in a practice run of one hundred signals, the first 
signal given was an ‘S,’ but you mistook it for an ‘O’ and responded as shown in 


















































































Fig. 2. Box score.


the first upper square of the sample form. When you heard the instructor’s 
‘Sugar,’ you should then have printed ‘S’ in the square directly below the one 
containing your incorrect ‘O.’ To the next signal let us suppose that you 
responded with an ‘A,’ which was confirmed, at the end of the usual three 
seconds, by the instructor’s ‘Able,’ making it unnecessary for you to make any 
correction in the lower square. With respect to the third signal, we may sup- 
pose that you were unable to make any response within the three-second 
interval, and that the instructor then identified it as ‘Two.’ As in the case of 
the ‘S,’ you would then have placed a ‘2’ in the square immediately beneath 
the one which you failed to fill in before the announcement. 

In this fashion you would have proceeded throughout the entire 100-signal 
run, using the lower squares only in the case of an incorrect or omitted response. 
At the termination of the run you would have been able to add up your errors 
very quickly by totalling the number of entries in the lower squares of each row. 
This total is given in the right-hand margin of the sample form.” 
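
The bookkeeping described in these instructions can be summarized in a short sketch. It assumes that an omitted response is represented by `None`; the function and variable names are hypothetical. The three signals are those of the worked example in the instructions.

```python
def score_run(signals, responses):
    """Fill the upper and lower rows of a practice sheet for one run.

    The upper row holds the student's responses (blank if omitted). The
    lower row holds the announced character only where the response was
    wrong or missing. The error total is the number of lower-row entries.
    """
    upper, lower = [], []
    for signal, response in zip(signals, responses):
        upper.append(response or "")
        lower.append("" if response == signal else signal)
    errors = sum(1 for entry in lower if entry)
    return upper, lower, errors


# The example from the instructions: "S" mistaken for "O", "A" copied
# correctly, and no response made to "2" within the three-second pause.
upper, lower, errors = score_run(["S", "A", "2"], ["O", "A", None])
print(upper)   # ['O', 'A', '']
print(lower)   # ['S', '', '2']
print(errors)  # 2
```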


Following this description of the use of the record blanks, the students 
were introduced to the “box-score” sheets, one of which is represented
in Figure 2. On these forms were recorded, throughout the entire train- 
ing period, the number of errors made in each hundred-signal run. Thus, 





Studies in International Morse Code 411 


in Figure 2, the entries provide, at a glance, a picture of the student’s 
progress from run to run and hour to hour of practice. These sheets were 
handed out daily to the students at the beginning of the hour and were 
returned by the students to the instructor before they left the code room. 
The entries were made by the students following each practice run, and 
corresponded to the marginal entries on the record blanks. 

Finally, the students were cautioned against falsifying progress records 
on either the practice sheets or box scores. The present method of 
instruction obviously leaves open the possibility of cheating, and it also 
permits occasional really doubtful cases to arise—cases in which the 
student is uncertain whether he responded to a given signal before or after 
the voiced character. There is no way of avoiding these doubtful cases 
which, fortunately, are rare; and it would be difficult to keep a chronic 
cheater from filling in upper squares after the announcements, if he so 
desired. Happily, again, however, such cases have not yet been detected 
in the code room. Apparently the high motivation to learn, the willingness
to “play the game,” the possibility of “no-voice” runs at critical 
points in training, and the presence of assistants or other students beside 
them at the code tables, were sufficient to minimize or eliminate entirely 
this type of behavior. A daily no-voice run would, of course, be feasible 
in any training situation where wide-spread faking was suspected. 

Summarizing briefly, the first class hour was devoted primarily to 
familiarizing the students with the details of the method to be used in 
their later practice periods. Thus, they were acquainted with the correct 
mode of printing characters, the standard phonetic equivalents to be used 
by their instructor, and the proper use of practice sheets and box scores. 
In addition, they were warned against misrepresenting their true progress 
in the course by altering their error records. 

At the second meeting of the class, following a few minutes’ review of 
the preceding hour’s instructions, code practice began. Before any actual 
records were taken, however, the students were given a single introduction 
to the sound-pattern of each of the thirty-six characters. The signals 
were transmitted with a standard telegraph key by a skilled radio opera- 
tor. Each signal was sent at a speed which, if the standard interval 
between successive signals was maintained, would approximate a rate of 
eighteen to twenty words per minute. This speed of signal was main- 
tained, as closely as possible, throughout the entire training period. 

A small commercial audio oscillator with a built-in magnetic speaker 
was used as a sound source. Variable energy and frequency control 
made it possible to provide signals of acceptable loudness and pitch. 
(Improved equipment, now in use, permits the students to receive through 
ear-phones both code signals and voiced characters.) No marked dis- 








412 Fred S. Keller 


tortion of signals due to reverberation was noted in the code room within 
the listener range employed. 

As indicated above, the regular training procedure was exceedingly 
simple. A code signal was sounded; about three seconds later, the appro- 
priate phonetic equivalent was announced; and, one second after the 
announcement, the next signal was presented. The manner of timing 
the intervals was relatively crude; a fairly reliable method of estimating 
the three-second pause consisted in the sub-vocal repetition, by the
instructor, of some familiar word (e.g., “chimpanzee”) at a suitable tempo. 
Occasional reference to a stop-watch was sufficient to keep the tempo 
adjusted satisfactorily. 
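
The signal-and-voice cycle described above can be laid out as a simple timetable. The following is only a sketch: the function name and the `signal_duration` parameter are hypothetical, and the time taken by the spoken announcement itself is ignored, since the text gives no figure for either.

```python
def run_schedule(n_signals=100, response_pause=3.0, post_voice_pause=1.0,
                 signal_duration=0.0):
    """Return (signal_onset, announcement_time) pairs, in seconds.

    Each cycle is: code signal, `response_pause` seconds for the student's
    response, the voiced announcement, then `post_voice_pause` seconds
    before the next signal. `signal_duration` is an assumed allowance for
    the length of the signal itself and defaults to zero.
    """
    schedule = []
    t = 0.0
    for _ in range(n_signals):
        announce = t + signal_duration + response_pause
        schedule.append((t, announce))
        t = announce + post_voice_pause
    return schedule


events = run_schedule()
first_three = events[:3]           # [(0.0, 3.0), (4.0, 7.0), (8.0, 11.0)]
last_announcement = events[-1][1]  # 399.0 seconds with the defaults
```

Under these assumptions a hundred-signal run takes a little under seven minutes, which is in keeping with the report that about four runs, each followed by a rest-pause, could be given in a class hour.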

A practice run of signals contained one hundred signal-and-voice 
sequences of this sort; and during a single class hour it was usually possible 
to give four of these runs, each followed by a short rest-pause. During 
this pause, the students totalled their errors and made the appropriate 
entries in their box-score sheets. A rapid summation of errors was per- 
mitted by the fact that all corrections, either of erroneous or omitted 
responses, were made in the lower squares of each double row on the record 
blanks. These corrections served also to show the individual student 
which of the signals were giving him especial trouble, thus enabling him 
to concentrate upon them in later runs. 

The “whole method” of instruction was employed. All thirty-six 
signals were presented in random order within each run. (This proce- 
dure may or may not be superior to the more conventional methods, 
which ordinarily deal with small groups of signals at a time, but it was 
adopted for use with this class for reasons which will be presented in a 
later study.) No one sequence of signals was repeated in successive 
runs, and very rarely was any signal presented twice in succession. 
Approximately equal representation of the signals was thus provided in 
the course of training for each student. 
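
A run with these properties could be generated as in the sketch below. It is an illustration only, not the procedure actually used in the code room: it forbids immediate repetitions outright (the text says only that they were very rare), and it takes the thirty-six characters to be the twenty-six letters and ten digits.

```python
import random
import string

# Assumed character set: 26 letters plus 10 digits.
CHARACTERS = list(string.ascii_uppercase + string.digits)


def make_run(length=100, rng=None):
    """Return a random run with near-equal representation of characters.

    Every character appears either floor(length / 36) times or one more,
    and no character follows itself. The pool is reshuffled until no two
    neighbours are equal (simple rejection sampling).
    """
    rng = rng or random.Random()
    base, extra = divmod(length, len(CHARACTERS))
    pool = CHARACTERS * base + rng.sample(CHARACTERS, extra)
    while True:
        rng.shuffle(pool)
        if all(a != b for a, b in zip(pool, pool[1:])):
            return list(pool)


run = make_run(rng=random.Random(0))
```

Calling `make_run()` afresh for each run gives a different sequence each time, so that no one sequence is repeated in successive runs.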

The criterion of mastery to be reached by each student was arbi- 
trarily set at three successive runs in each of which his responses were 
at least ninety-five per cent correct. Although this criterion proved 
satisfactory, in terms of rapidity of progress at later stages of instruc- 
tion, a more rigorous requirement might well have been chosen. Many 
of the students, in spite of their objective record of mastery, reported a 
lack of confidence in their ability and showed an unwillingness to move 
on to a higher level of training. Moreover, a certain degree of over- 
learning at any level may be advantageous. 
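
The three-run criterion can be checked mechanically from a student's box scores. The sketch below assumes hundred-signal runs, so that ninety-five per cent correct means five errors or fewer; the function name and the sample scores are hypothetical.

```python
def run_reaching_criterion(errors_per_run, max_errors=5, streak=3):
    """Return the 1-based number of the run on which mastery is reached.

    Mastery means `streak` successive runs with at most `max_errors`
    errors each. Returns None if the criterion is never met.
    """
    consecutive = 0
    for run_number, errors in enumerate(errors_per_run, start=1):
        if errors <= max_errors:
            consecutive += 1
            if consecutive == streak:
                return run_number
        else:
            consecutive = 0  # a poor run breaks the succession
    return None


# Hypothetical box scores: errors in successive hundred-signal runs.
box_scores = [40, 22, 12, 5, 4, 7, 3, 5, 2]
mastered_on = run_reaching_criterion(box_scores)  # run 9
```

A more rigorous requirement of the kind suggested above could be tried by lowering `max_errors` or lengthening `streak`.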

Table 2 summarizes data on the time required, by a group of forty- 
eight inexperienced students, to reach the three-run criterion mentioned. 
The average number of hours required was 8.8 (range, 4-17). Four 
students failed to reach the criterion within twenty-five hours, but were 
advanced to the five-word-per-minute level nevertheless, in order to 
reduce the instructional load. 

In comparison with available reports of results from code schools in 
which the “old” method is used, these figures indicate a very high rate 
of progress. Until, however, a more direct and controlled investigation 
is made, in which large groups are equated with respect to such factors 
as code aptitude, intelligence test scores, hours of instruction daily, and 
so forth, caution should be exercised in making assertions about the 
intrinsic superiority of the present method of instruction. 

Table 2 

Hours of Training Required by 48 Students to Master 36 Morse Code 
Signals by the Whole Method 

Hours to Master       Number of 
36 Characters         Students 

1-4 
5-8 
9-12 
13-16 
17-20 


Certain definite merits of the procedure, which are sufficient to warrant
its serious consideration as a training device, may nevertheless be 
pointed out. The effectiveness of regular, fairly immediate, reinforcement
of each response to a discriminative stimulus is well established as 
a psychological principle. Certainly it has today more adherents than 
has the “law of frequency,” the belief in which might easily have been 
the basis of the older methods, which are often defended with the statement
that long-continued hearing of signals is the primary condition 
of code learning. The number of signals to which students are exposed 
under the older methods is many times greater than under the one here 
described, but the effectiveness of training, according to present reports, 
is considerably less. At what point the reinforcement no longer depends 
on the instructor’s announcement of characters is still undetermined for 
any of the stimulus-signals, but there would appear to be little question 
of the value of the verbal confirmation of responses at the initial stage 
of instruction. 

Motivation, under the present method, is maintained at a high level 
throughout training. This is apparent, not only in the general behavior 
of the students in the code room and their reports of intense interest in 
the task, but also in the learning curves, either group or individual, 
that have been plotted. Quite commonly, and particularly in the case 
of superior students, a straight-line curve best represents the relation 
between errors and successive runs, and only rarely are there depressions 
in the curves which might argue for a loss of “drive.” This maintenance 
of motivation is clearly related to the fact that the student can observe, 
in his record sheets and box scores, the regular decrease of erroneous 
responses. At no time is he kept in doubt for long concerning the reality 
of his progress. 

Other practical advantages of the method are more or less auxiliary 
and hold especially when the method is used in training men for work in 
military communications. (a) The printing of letters and digits, of 
standard form and size and in cipher groups of five (commonly used in 
military messages), is encouraged and aided by the use of the squares 
and blocks of the practice sheets. (b) Facility in the recognition and 
use of the official phonetic equivalents is achieved by the student with- 
out effort, as a by-product of the instructor’s regular announcement of 
characters. (c) Good use may be made of both practice-sheet and 
box-score records by the instructor in keeping in touch with the progress 
of individual students, in the construction of group curves or charts for 
code-room display, in keeping attendance records, and so forth. 

Serious defects in the method are not yet apparent, but one source 
of minor difficulty for some students thus trained seems to exist at the 
stage of transition to, say, a five-word-per-minute level of reception, in 
which signals are presented in closer sequence, without any identification 
until the end of the run. The temporary confusion which results occa- 
sionally may be greatly reduced, if not eliminated entirely, by one or 
more simple variations of the initial method. For example: (a) students 
who have reached the criterion of mastery with the recommended three- 
second interval in which to make each response, may be given runs in 
which the interval is reduced to two seconds, until a new level of pro- 
ficiency is reached; (b) the criterion itself may be raised, thus providing 
for over-learning and a related decrease in reaction-times to specific 
signals; (c) following initial mastery, or even before, signals may be sent 
in pairs, at an over-all rate of about ten words per minute, with paired 
announcements after the usual three-second pause; (d) no-voice runs at 
a three- to four-word-per-minute level may occasionally accompany the 
regular runs at a point near mastery. There is warrant for believing 
that each of these procedures is effective in preparing the student for the 
mode of instruction which will be employed at higher code speeds. 

Mention has been made of the usefulness of the method as a device for 
furthering research on code-training problems. A single instance may 
be cited here in support of this statement. A question of basic importance
to the economizing of time spent in code learning is that of the 
relative “difficulty” of the signals, and the reasons therefor. The answer 
to this question, which is essentially a problem of stimulus generaliza- 
tion, should pave the way to more efficient selection of signal groups 
(when the “part method” is employed), better weighting and arrange- 
ment of signals in any practice sequence, and increased understanding 
of the differences between low- and high-speed code reception. An ap- 
proach to this analysis of difficulty lies in the examination of errors made 
by students in the actual training situation. The present method makes 
available, for the first time, the material required for such an examination. 


Received June 1, 1943. 





























Occupational Differences in Manipulative Performance of 
Applicants at a Public Employment Office 


Lorene Teegarden 
Department of Psychological Services, Cincinnati Public Schools 


The principal purpose of a testing program in an employment office 
is to aid in the selection of applicants to be recommended for specific 
jobs. Manipulative tests which were used at the Cincinnati Employment
Center were described in an earlier report.¹ Applicants who took 
the tests were given percentile ratings on their performance. This was 
an evaluation of their performance in comparison with that of other 
young white adults of the same sex who were tested. Norms on which 
such ratings were based were presented in the article referred to above. 

In addition, each applicant’s performance was studied in relation to 
that of other applicants tested, who had had experience in the same 
occupation or occupations as those claimed by the applicant. 

In the study of occupational differences it was not possible to deter- 
mine the applicant’s efficiency on any job, nor to make any distinction 
between different grades of jobs, as between waitresses in high grade or 
low grade restaurants. Nor was it possible to verify the length of ex- 
perience. Applicants knew that their statements would be checked by 
the Employment Center through reference letters to former employers, 
and it was assumed that the records given in their applications were 
sufficiently accurate to use in the employment study. Experience at 
part time work was estimated in terms of the time actually worked on 
the job. Thus a half time job for six months was tabulated as three 
months experience. A job after school and all day on Saturday was 
considered equivalent to half time, and other part time jobs were esti- 
mated in a similar manner in terms of time actually employed. Jobs 
of less than one month were excluded. If a person had had several jobs 
of a similar character, his experience was counted as the total for all 
those jobs. 
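
The tabulation rule for experience can be sketched as follows. The data layout is hypothetical, and the one-month exclusion is applied here to the calendar length of a job, which is an assumption; the text does not say whether calendar time or time actually worked was meant.

```python
def months_of_experience(jobs):
    """Total experience, in full-time months, over jobs of similar character.

    `jobs` is a list of (calendar_months, fraction_of_full_time) pairs.
    A half-time job has fraction 0.5; a job after school and all day on
    Saturday is likewise treated as 0.5. Jobs lasting less than one
    calendar month are excluded (assumed reading of the rule).
    """
    total = 0.0
    for calendar_months, fraction in jobs:
        if calendar_months < 1:
            continue
        total += calendar_months * fraction
    return total


# A half-time job held for six months counts as three months.
half_time = months_of_experience([(6, 0.5)])
# Several similar jobs are totalled; the two-week job is dropped.
combined = months_of_experience([(6, 0.5), (12, 1.0), (0.5, 1.0)])
```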

From the above account it will be seen that each occupational group, 
taken as a whole, includes records of long and of short experience, of the 
efficient and the inefficient, the well adjusted and the poorly adjusted, 
those who left for better jobs as well as those who left for no definite 
reason, and those who were discharged for any reason or laid off because 
work was slack. Moreover, the same individual’s record may be included
in different occupational tables, in which he may have had 
different amounts of experience or different degrees of success. Despite 
these conditions which would tend to increase the similarity of distributions
for different occupations, a study of the data reveals distinct 
differences. 


¹ Lorene Teegarden. Manipulative Performance of Young Adult Applicants at a 
Public Employment Office. J. appl. Psychol. 1942, 26, 633-652; 754-769. 

All ages were included in the occupational studies. The majority of 
the records, however, are those of young adults between the ages of 16 
and 25 years. No records of colored applicants are included. 


Occupational Distributions of Test Performance 


Tables 1 and 3-10 show the distributions in certain occupations of 
men. Tables 11 and 13-18 present similar data for women’s occupa- 
tions. The 10th, 25th, 50th, 75th and 90th percentile points are given 
for each occupation, not in terms of actual performance time at that 
level, but in terms of the corresponding percentile level on the general 
norms which would be assigned for that performance time.² Thus a 
performance time which marks the 50th percentile or median rating in 
a given occupation may correspond to the 65th percentile level of the 
unselected group on which the general norms are based. 
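
The conversion from an occupational percentile to a level on the general norms can be sketched as below. The performance times are invented solely to illustrate the arithmetic, the function names are hypothetical, and a simple nearest-rank rule is assumed; the report does not state how its percentile points were computed.

```python
def time_at_percentile(times, p):
    """Performance time that surpasses about p per cent of the group.

    Shorter times are better, so the group is ordered slowest first.
    """
    slowest_first = sorted(times, reverse=True)
    k = min(len(slowest_first) - 1, int(len(slowest_first) * p / 100))
    return slowest_first[k]


def percent_surpassed(norm_times, t):
    """Per cent of the norm group whose performance time is longer than t."""
    slower = sum(1 for x in norm_times if x > t)
    return 100.0 * slower / len(norm_times)


# Invented data: times in seconds for an unselected norm group and for
# one occupational group.
norm_times = list(range(101, 201))    # 100 applicants, 101 to 200 seconds
occup_times = list(range(96, 176))    # 80 workers, 96 to 175 seconds

occup_median = time_at_percentile(occup_times, 50)        # 135 seconds
norm_level = percent_surpassed(norm_times, occup_median)  # 65.0
```

With these invented figures the occupational median corresponds to the 65th percentile of the general norms, which is the form in which the tables report each occupational level.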

Figures 1-5 show similar data for a few selected occupations. 


Occupations of Men 


Helpers in Skilled Trades. This group includes all who had worked 
for one month or longer as helper in any skilled trade, such as machinist’s 
helper, plumber’s helper, carpenter’s helper, roofer’s helper, paperhanger’s 
helper, or any other job which, as described by the applicant, would 
seem to fall into such a classification. Workers who had later become 
skilled tradesmen were not included for the reason that such applicants 
were given verbal trade tests rather than manipulative tests. Appren- 
tices were included, but students with training in vocational schools were 
not included unless they had had actual experience as helper to a skilled 
tradesman. 

The median performance of this group on the Spatial Relations test 
(Table 1) was better than that of 65 per cent of unselected applicants, 
according to norms established at the Employment Center. On the 
Kent-Shakow test, half the helpers in skilled trades surpassed 67 per cent 
of the unselected group on simple problems, and 71 per cent on complicated
problems. In Placing and Turning they surpassed 67 and 55 
per cent respectively. Only on Plier Dexterity did the median of the 
occupational group fall as low as that of the unselected group on which 
the test norms were based. Since exact neuromuscular control is not 
required in all the skilled trades it is not surprising that on the Plier 
Dexterity test which measures that quality the occupational median is 
not high. In problem solving, as measured by the Kent-Shakow test, 
only the lowest 25 per cent of the occupation fall in the lower half of the 
unselected group, while in Spatial Relations and in Placing and Turning 
the 25th percentile level of the occupational group surpasses more than 
one third of the unselected group. 


² For tables and figures, presenting distributions and norms for all tests used, see 
Ibid. 


Fig. 1. Helpers in skilled trades. Levels on general norms attained by 10, 25, 50, 75 
and 90th percentile levels of occupational group. (Chart: norm P.R. against 
occupational P.R. for Spatial Relations, N=85; Kent-Shakow Simple, N=135; 
Kent-Shakow Complex, N=135; Placing, N=71; Turning, N=71; Plier Dexterity, N=60.) 

Fig. 2. Male operators of factory machines. Levels on general norms attained by 
10, 25, 50, 75 and 90th percentile levels of occupational group. (Chart: norm P.R. 
against occupational P.R. for Spatial Relations, N=51; Kent-Shakow Simple, N=65; 
Kent-Shakow Complex, N=65; Placing, N=41; Turning, N=41; Plier Dexterity, N=38.) 

Fig. 3. Manual laborers. Levels on general norms attained by 10, 25, 50, 75 and 90th 
percentile levels of occupational group. (Chart: norm P.R. against occupational 
P.R. for Spatial Relations, N=106; Kent-Shakow Simple, N=173; Kent-Shakow 
Complex, N=173; Placing, N=97; Turning, N=97; Plier Dexterity, N=72.) 

Fig. 4. Factory workers at hand occupations (Females). Levels on general norms 
attained by 10, 25, 50, 75, and 90th percentile levels of occupational group. 
(Chart: norm P.R. against occupational P.R. for Spatial Relations, Kent-Shakow 
Simple and Complex, Placing, Turning, and Plier Dexterity.) 

Fig. 5. Packers and wrappers (Females). Levels on general norms attained by 10, 25, 
50, 75 and 90th percentile levels of occupational group. (Chart: norm P.R. against 
occupational P.R. for Spatial Relations, Kent-Shakow Simple and Complex, Placing, 
Turning, and Plier Dexterity.) 

Table 1 

Manipulative Performance of Helpers in the Skilled Trades. (Percentile rating on 
general norms corresponding to the 10, 25, 50, 75 and 90th percentile 
levels of the occupational group) 

Occup.                  Kent-Shakow 
P.R.     Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

No.         85         135      135        71        71        60 

         Per Cent of Unselected Applicants Surpassed by Occupational P.R. 
90          95          89       89        93        88        93 
75          86          80       82        82        75        78 
50          65          67       71        67        55        50 
25          34          49       49        37        36        29 
10          12          29       19        24        23        17 


How much faster is the performance of this group than the test 
performance of unselected applicants represented by the test norms? 
Table 2 gives the per cent of time by which a performance at each decile 
point surpasses performance at the next lower decile point. At the 
extremes of the scale the interval used is five percentiles rather than a 
full decile. The median performance of skilled trade helpers on Spatial 
Relations equals percentile 65 on the norms. According to Table 2 
percentile 60 is 4.9 per cent faster than percentile 50, and 70 is 4.7 per 
cent faster than 60. The median performance of the occupational group, 
then, is approximately 7.2 or 7.3 per cent (4.9 plus one half of 4.7) 
faster than the median of unselected applicants. By a similar 
comparison for simple and complex problems of the Kent-Shakow test, 
we find that the median performance of skilled trade helpers requires 
approximately 14 per cent less time on simple problems and 38 per cent 
on complex problems. On Placing and Turning the occupational median 
is shorter by only three and two per cent respectively; and on Plier 
Dexterity it coincides with the general median. Thus it appears from 
the test results that the superiority of this group is greatest in ability 
to react to varied details and in ability to solve problems. 
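
The arithmetic of the Spatial Relations example can be written out as a short sketch. The function name is hypothetical, and two assumptions are made to match the example in the text: the gain within an interval is taken in proportion to the distance covered, and the gains of successive intervals are added rather than compounded.

```python
# Table 2 (males), Spatial Relations column: per cent of time gained for
# each ten-percentile step upward from the median of the general norms.
SPATIAL_RELATIONS_GAIN = {
    (50, 60): 4.9,
    (60, 70): 4.7,
}


def time_gain_above_median(norm_percentile, gains):
    """Approximate per cent of time gained over the 50th percentile.

    Intervals lying wholly below `norm_percentile` are added in full; the
    interval containing it is added in proportion to the part covered.
    """
    total = 0.0
    for (low, high), gain in sorted(gains.items()):
        if norm_percentile >= high:
            total += gain
        elif norm_percentile > low:
            total += gain * (norm_percentile - low) / (high - low)
    return total


# Median of skilled trade helpers = percentile 65 on the general norms.
gain = time_gain_above_median(65, SPATIAL_RELATIONS_GAIN)  # 4.9 + 0.5 * 4.7
```

This gives 7.25 per cent, in agreement with the "7.2 or 7.3 per cent" stated above; the same function could be applied to the other columns of Table 2 by supplying their interval gains.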

By comparing specific occupational tables with Table 2 for men, or 
with Table 12 for women, such comparisons can be made for each occu- 
pation presented. 

Table 2 

Per Cent of Time Gained by Gain of Ten Percentiles in Performance Rating. (Gain 
of five per cent at extremes of range) 
Males 

         Per Cent of Time Gained with Gain in P.R. on Test 

Interval 
of Gain                  Kent-Shakow 
in P.R.   Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

90-95        6.8       12.9     16.1       2.9       3.8       5.8 
80-90       10.1       10.3     20.2       4.2       5.4       8.0 
70-80        6.6        8.8     21.4       2.7       3.5       5.1 
60-70        4.7        8.6     21.3       1.4       3.4       5.7 
50-60        4.9        8.8     15.3       2.2       2.3       5.4 
40-50        4.5        8.9     26.5       2.1       3.2       4.8 
30-40        5.1        8.9     20.9       2.5       3.6       6.4 
20-30        7.9       12.1     23.2       2.4       5.3       9.4 
10-20       11.4       15.4     27.8       3.5       6.7      11.3 
 5-10        7.2       22.5     20.0       3.4       5.9       8.6 


Skilled trade helpers as a group are equalled or surpassed by several 
occupations in rapidity of hand movement (Placing test) or in two-hand 
coordination (Turning), and are surpassed in both these traits and in 
exact movements (Plier Dexterity) as well by the occupational group 
of wrappers and packers (Table 7). But among the occupations studied 
none surpasses them in reaction to a multiplicity of details (Spatial 
Relations) or in ability to solve problems requiring resourcefulness and 
ability to modify methods of work (Kent-Shakow). While these three 
tests are the most significant, apparently all the tests in the battery 
measure traits which are of importance for this occupational group. At 
every level on every test the helpers’ group compares satisfactorily with 
or does better than the corresponding level of unselected applicants. 

This group represents a mixture of many kinds of jobs. There is 
need for further research directed toward determination of traits significant
for specific trades which differ in their requirements. 

Truck Drivers and Chauffeurs. A large majority of records in this 
group (Table 3) are those of truck drivers, with a smaller number of 
private chauffeurs and a limited number of taxi or bus drivers. 







The levels of response to details (Spatial Relations), and of problem 
solving ability (Kent-Shakow test) are above those of the unselected 
group but not so high as they are for helpers in the skilled trades. The 
number of drivers who took the other tests was not large, and the results 
may therefore be less reliable, but they suggest one-hand and two-hand 
ability which is rather similar to that of the unselected group; and plier 
dexterity, or neuromuscular control which, although it is no poorer at 
the lower level, is not so good at the middle and upper levels as the 
standards set by unselected applicants. 

Table 3 

Manipulative Performance of Truck Drivers and Chauffeurs. (Percentile rating on 
general norms corresponding to the 10, 25, 50, 75 and 90th percentile levels 
of the occupational group) 

Occup.                  Kent-Shakow 
P.R.     Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

No.         78          99       99        40        40        34 

         Per Cent of Unselected Applicants Surpassed by Occupational P.R. 
90          89          87       86        87        80        91 
75          79          72       76        80        69        58 
50          62          61       63        50        55        38 
25          44          42       42        36        27        24 
10          14          23       15        20        17        15 


Automobile drivers must respond quickly to details of the scene before 
them. They must use judgment in unexpected situations and must be 
able to respond in a variety of ways according to the needs of the situation.
These are abilities required by the tests in which the group rates 
high. But the need for two-hand coordination and for exactness of 
movement is not great enough to raise these traits far above the level of 
the general group. 

Truck Helpers and Loaders. Although the number in this group is 
small, nevertheless they present an interesting contrast to the drivers. 
Applicants were listed as helpers or loaders if they did not have charge 
of the truck on the road. Some individuals in the group did loading only. 
Others accompanied the truck, helped load and unload, and in some cases 
assisted with the driving. If a man had had a job as helper and another 
as driver, his record was included in both groups unless one type of job 
was short and the other much longer. In such a case the short job was 
ignored and the record was entered only for the occupation of longer 
experience. 










Table 4 

Manipulative Performance of Truck Loaders and Helpers. (Percentile rating on general 
norms corresponding to the 10, 25, 50, 75 and 90th percentile levels 
of the occupational group) 

Occup.                  Kent-Shakow 
P.R.     Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

No.         32          48       48        26        26        20 

         Per Cent of Unselected Applicants Surpassed by Occupational P.R. 
90          94  85  89  79  85   [five of six entries survive; columns uncertain] 
75          70          69       73   [illegible]    71        50 
50          40          42       53        57        45        37 
25          20          27       18        35        29        17 
10          06          14       06        18        16        10 


As a group, helpers and loaders (Table 4) rate below drivers in most 
tests at the median level, although the best 25 per cent of the group 
compare rather closely with the corresponding portion of the drivers. 
In time required to complete the tests, the median performance of drivers 
surpassed helpers and loaders by 11 per cent on Spatial Relations, by 
approximately 17 and 16 per cent on simple and complicated problems 
respectively, by only three per cent on Turning, and not at all on accurate
movements. 


Table 5 

Manipulative Performance of Male Hand Operators in Factories. (Percentile rating 
on general norms corresponding to the 10, 25, 50, 75 and 90th percentile 
levels of the occupational group) 

Occup.                  Kent-Shakow 
P.R.     Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

No.         67          89       89        67        67        56 

         Per Cent of Unselected Applicants Surpassed by Occupational P.R. 
90          97  86  86  93  91   [five of six entries survive; columns uncertain] 
75          92          76       79   [illegible]    82        82 
50          74          63       67        65        58        43 
25          38          45       48        37        33        27 
10          15          32       14        23        22        19 

Factory Labor, Hand Operations. The group of factory workers at 
hand operations includes probably as wide a variety of jobs as helpers 
in skilled trades. The profile of the group as a whole (Table 5) is rather 
similar to that of the helpers. 







Factory Machine Operators. Here are included operators of all kinds 
of factory machines (Table 6). Some records are included in both the 
hand and the machine operators’ groups because the applicants had 
worked at both kinds of jobs. 

The middle half of the group tends to surpass unselected applicants 
except in exact movements. The top records surpass in all except prob- 
lem solving; and the poorest records equal or surpass the general group 
in all tests, including even problem solving. Throughout the range there 
is a tendency to rate better in one-hand than in two-hand operations, 
a tendency which appears also among hand operators and helpers in the 
skilled trades. 

Table 6 

Manipulative Performance of Male Operators of Factory Machines. (Percentile rating 
on general norms corresponding to the 10, 25, 50, 75 and 90th percentile 
levels of the occupational group) 

Occup.                  Kent-Shakow 
P.R.     Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

No.         51          65       65        41        41        38 

         Per Cent of Unselected Applicants Surpassed by Occupational P.R. 
90          94          87       85        95        93        93 
75          80          75       78        87        83        62 
50          57          57       61        67        57        40 
25          30          30       34        42        36        25 
10          12          12       09        27        20        16 


Wrappers and Packers. Besides workers engaged exclusively in 
wrapping and packing, this group includes workers in shipping departments
who described their work as handling goods more than keeping 
records. 

At all levels (Table 7) they surpass unselected applicants in Placing, 
Turning, and Plier Dexterity. At the middle and lower levels they sur- 
pass unselected applicants also in problem solving, but not at the two 
upper levels, the 75th and 90th percentiles. Apparently the minimal 
requirements for this occupation, in certain abilities, are above the 
standards of the lower portion of the unselected population, but the 
maximal requirements are not correspondingly high. 

The profile for wrappers and packers resembles that of machine 
operators rather closely, except that packers rate slightly higher in 
accurate movements. 

Sales Clerks. This group is made up principally of grocery and drug 
store clerks, with a limited number of oil station workers, clerks in men’s 
furnishings departments or stores, and miscellaneous salesmen for electrical
goods and other merchandise. Most of the jobs included more of 
handling goods than of working on the customer to effect the sale. Their 
work was more manipulative than it was true selling. 


Table 7 

Manipulative Performance of Male Wrappers and Packers. (Percentile rating on general 
norms corresponding to the 10, 25, 50, 75 and 90th percentile levels of the 
occupational group) 

Occup.                  Kent-Shakow 
P.R.     Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

No.         49          74       74        45        45        36 

         Per Cent of Unselected Applicants Surpassed by Occupational P.R. 
90          94          88       88        95        94        93 
75          84          76       75        88        81        80 
50          62          61       60        68        59        53 
25          29          36       33        40        43        32 
10          07          21       18        21        29        23 


Table 8 

Manipulative Performance of Male Sales Clerks. (Percentile rating on general norms 
corresponding to the 10, 25, 50, 75 and 90th percentile levels of the 
occupational group) 

Occup.                  Kent-Shakow 
P.R.     Spa. Rel.   Simple   Complex   Placing   Turning   Pliers 2 

No.         92         164      164        82        82        65 

         Per Cent of Unselected Applicants Surpassed by Occupational P.R. 
90          95          88       88        92        92        93 
75          86          79       78        82        83        81 
50          63          63       64        61        61        44 
25          36          39       44        39        36        27 
10          12          24       17        17        24        14 





Sales clerks (Table 8) show no peak in any one ability which would 
compare with the peak in one-hand movements shown by machine 
operators. They are distinctly above the level of unselected applicants 
except in exact movements, in which they drop at the middle and lower 
levels. 

Waiters, Kitchen Helpers, Bus Boys and Dish Washers. The number 
of waiters was not sufficiently large to make a separate group. Consequently
they were combined with other restaurant workers in a single 
occupational group (Table 9). 

In this occupation the middle 50 per cent rate equally high in both 
Placing and Turning. The same is true of sales clerks. But restaurant 
workers do not rate quite as high in these tests as do sales clerks, and 
they drop much lower also in accuracy of movement. 


Table 9 
Manipulative Performance of Waiters, Kitchen Helpers, Bus Boys. (Percentile rating 
on general norms corresponding to the 10, 25, 50, 75 and 90th percentile 
levels of the occupational group) 





                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         83        87      87       46       46       37
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          88        82      84       88       91       87
75          70        71      69       77       76       57
50          54        54      53       57       57       32
25          32        28      20       31       31       21
10          15        15      10       21       16       13





Table 10 
Manipulative Performance of Manual Laborers. (Percentile rating on general norms 
corresponding to the 10, 25, 50, 75 and 90th percentile levels of the 
occupational group) 





                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.        106       173     173       97       97       72
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          92        82      82       91       84       83
75          76        70      74       67       70       57
50          49        53      60       51       47       39
25          19        32      27       29       29       21
10          07        17      11       17       12       12





Manual Laborers. Applicants included in this group had done man- 
ual labor for private employers, on WPA projects, or in CCC camps. 
In general this group tends to rate somewhat lower than any other group 
studied (Table 10). The lowest quarter was not so low, however, nor 
the highest quarter as high as the corresponding portions of the group 
who had no work experience, data for which are omitted from this report. 







Occupations of Women 


Factory Workers at Hand Operations. This group includes some who 
had had experience as inspectors or testers as well as at ordinary factory 
jobs. Assemblers, inspectors and testers were also recorded separately 
for comparison with the entire group, and they will be discussed later. 

In this occupational group (Table 11) workers at the 75th percentile 
level and above do equally well in one-hand and two-hand operations 
and in reaction to details. At the median and lower levels their ratings 
decrease more rapidly in reaction to details than in two-hand and slightly 
more rapidly in two-hand than in one-hand movements, and in none of 
these do they drop below the unselected group. In exact muscular 
control (Plier Dexterity), however, they drop below unselected applicants 
at all levels. 


Table 11 
Manipulative Performance of Women Factory Workers at Hand Occupations. 
(Percentile rating on general norms corresponding to the 10, 25, 50, 75 
and 90th percentile levels of occupational group) 

                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         54       120     120      119      119       50
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          92        85      83       91       90       80
75          82        69      75       83       83       64
50          55        54      61       66       62       33
25          26        30      34       39       32       16
10          13        16      08       22       10       06

Assemblers, Inspectors, Testers. It was noted that applicants claim- 
ing to have experience in inspecting, testing, or assembly work seemed to 
do well at the tests. Since most of them had also been employed as 
ordinary operators they were included in that group which has just been 
discussed. If they had worked one month or longer as inspectors, testers 
or assemblers they were included also in this group. Although the 
numbers are not large, the profile is so different from that of the operators 
that the group is presented separately. 

As compared with operators, this occupational group (Table 13) is 
only slightly faster in Placing and Turning (two to seven per cent 
difference in performance time at the middle and lower levels), in simple 
problems (four to eight per cent at the lower levels), and in Spatial 
Relations (ten per cent or less at the median and lower levels). In their 
time records at complicated problems of the Kent-Shakow test, however, 
and at Plier Dexterity, they surpass the operators by ten per cent or 
more throughout practically the entire range; and the poorest ten per 
cent of this group exceed the corresponding level of operators by almost 
40 per cent at complex problems and by more than 20 per cent at the 
pliers. This is not surprising in view of the nature of their jobs, and the 
fact that many of them had been engaged in assembling small and 
complicated parts. 


Table 12 
Per Cent of Time Gained by Gain of Ten Percentiles in Performance Rating. (Gain 
of Five percentiles at extreme range) 
Females 

         Per Cent of Time Gained with Gain in P.R. on Test
                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
             5.5      12.2    15.2      2.8      3.5      4.8
             7.9      10.8    21.7      3.6      6.5      6.1
             6.5       9.8    20.9      2.2      3.4      5.8
             5.6       8.9    20.4      2.6      2.8      3.1
             5.5       8.2    20.6      2.1      3.2      3.9
             6.1       9.3    29.6      2.1      3.1      5.1
             6.5      10.7    12.0      3.2      4.5      6.5
20-30        6.1      10.4    20.8      3.4      5.1      7.6
10-20       11.1      17.2    23.3      3.7      6.1     11.4
5-10         7.2      20.5    26.1      3.9      6.2      9.9

Table 13 
Manipulative Performance of Women Assemblers, Inspectors and Testers. (Percentile 
ratings on general norms corresponding to the 10, 25, 50, 75 and 90th 
percentile levels of occupational group) 

                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         22        50      50       45       45       16
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90 94 90 92 91 
75 85 78 86 84 
50 72 67 76 72 
25 35 42 51 50 
10 12 22 28 22 


A separate group of forty hand operators were tested in the factory 
where they were employed. Thirty of them worked at operations 
requiring exact movements. The median performance of these forty 
operators on the Plier Dexterity test was better than 61 per cent of 
unselected applicants. This compares closely with 63, which was the 
median percentile rating of the 16 assemblers, inspectors and testers 
who took this test as applicants at the Employment Center. Their 
median rating on Spatial Relations was 72, which coincides with the 
median rating of assemblers at the Center. Their ratings on Placing 
and Turning were based on different norms and therefore cannot be 
compared; and they were not given the Kent-Shakow test for the reason 
that they were engaged in highly routine operations only. 

Operators of Factory Machines. These include operators of all kinds 
of factory machines. Many were included also among the hand oper- 
ators since they had worked at both kinds of jobs. 


Table 14 


Manipulative Performance of Women Operators of Factory Machines. (Percentile 
ratings on general norms corresponding to 10, 25, 50, 75 and 90th percentile 
levels of the occupational group) 





                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         20        56      56       63       53       18
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          93        84      86       91       92       62
75          80        74      77       84       81       45
50          53        51      62       69       66       30
25          27        32      37       52       40       14
10          10        15      13       28       21       06





In the Spatial Relations and the Kent-Shakow tests the group 
(Table 14) is quite similar to hand operators. In Placing and Turning 
they are similar at the median and upper levels; and although at the 
25th percentile level machine operators are 12 and 8 points respectively 
above hand operators on these two tests, the difference in actual per- 
formance time is less than five per cent. In accuracy of movement, 
however (Plier Dexterity), the upper levels of machine operators are 
approximately 20 percentiles above and require about ten per cent less 
time than do hand operators of the corresponding level in their occu- 
pational group. 
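
The percentile ratings used in these tables can be illustrated with a minimal sketch. It assumes that a test score is a completion time (lower is better) and that a rating is the percentage of the unselected norm group surpassed. The norm times and the function name `percent_surpassed` are hypothetical and are not data from this study.

```python
def percent_surpassed(norm_times, time):
    """Percentage of the norm group whose time is slower (larger) than `time`."""
    slower = sum(1 for t in norm_times if t > time)
    return 100.0 * slower / len(norm_times)


# Hypothetical completion times (seconds) for ten unselected applicants.
norm_times = [52, 55, 58, 60, 63, 65, 68, 72, 77, 85]

# A hypothetical occupational-group median of 61 seconds is faster than
# six of the ten norm times.
rating = percent_surpassed(norm_times, 61)
print(rating)  # 60.0
```

Because norm times bunch together near the middle of the distribution, a gap of several percentile points can correspond to only a small difference in actual time, which is the point made in the paragraph above.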

Packers and Wrappers. In most of the occupations studied, only the 
middle and upper levels, and in some only the upper levels, rate equally 
high in both Placing and Turning. But here is an occupational group 
(Table 15) in which workers at the lower levels also rate as high in 
two-hand coordination as in one-hand operations. This was true of male 
restaurant workers and of male sales clerks, and we shall see that it is 
true also of female domestic workers. 

The median performance time of women packers and wrappers is 
quicker than that of unselected applicants at Placing by four per cent 
and at Turning by eight per cent. Their margin at lower levels is 
similar to this. In reaction to details, as measured by the Spatial Rela- 
tions test, the group surpasses the performance time of unselected 
applicants by from 5 to 12 per cent at different levels. In solving 
complicated problems their median performance exceeds the general 
norms by 12 per cent. 


Table 15 
Manipulative Performance of Women Packers and Wrappers. (Percentile ratings on 
general norms corresponding to the 10, 25, 50, 75 and 90th percentile 
levels of the occupational group) 





                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         32        83      83       75        5       26
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          90        85      86       93       Of       88
75          82        69      75       85       87       75
50          67        53      57       71       73       36
25          37        32      27       48       47       24
10          14        22      08       31       27       18





Sales Clerks. These included many workers in dime stores and 
confectioneries, and some sales girls in groceries and department stores. 
As a group (Table 16) they make their best showing in Placing, but the 
upper levels do just as well on Turning and on Spatial Relations. The 
performance on Spatial Relations, however, declines rapidly so that at 
the median and lower levels it is below the standards of the unselected 
group. In accurate movements (Plier Dexterity) the sales group is 
similar to factory hand operators, and rather better than factory machine 
operators. In this test the group is like wrappers and packers at the 
middle and lower levels, but not so fast as they are at the upper levels. 


Table 16 
Manipulative Performance of Women Sales Clerks. (Percentile ratings on general 
norms corresponding to the 10, 25, 50, 75 and 90th percentile levels of 
occupational group) 

                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         54        86      86       61       61       29
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          96        86      87       92       95       84
75          85        72      77       82       80       59
50          45        56      62       66       59       36
25          15        37      37       47       43       23
10          05        23      16       25       18       07


Waitresses. This group is limited to waitresses. It does not include 
bus girls or any workers in the kitchen or pantry. 

The upper levels of waitresses are similar in Placing and Turning to 
the upper levels of packers, factory hand and machine workers and 
sales clerks. At the median and lower levels, however, the waitresses 
decrease in ratings more rapidly than do any of these other occupations, 
although the differences appear greater in terms of percentiles than in 
terms of actual performance time. In Plier Dexterity they are similar 
to all these groups except that packers and wrappers surpass them at 
certain levels. In reaction to details (Spatial Relations) they are 
surpassed by packers, resemble somewhat the groups of factory hand and 
machine workers, and in turn surpass sales clerks at lower levels but are 
surpassed by the clerks at the higher levels. The middle fifty per cent 
of the waitress group rate low in accurate movements as compared with 
the unselected group. But the upper ten per cent are about as good and 
the lower ten per cent are but little poorer than the corresponding 
portions of the general group. In solving problems waitresses do less 
well than sales clerks and machine operators, but are similar to packers 
and hand operators. 


Table 17 
Manipulative Performance of Women Waitresses. (Percentile ratings on general norms 
corresponding to the 10, 25, 50, 75 and 90th percentile levels of 
the occupational group) 

                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         46       131     131       62       62       23
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          93        81      82       89       90       88
75          71        66      73       83       78       51
50          53        46      55       59       49       35
25          26        29      29       37       26       20
10          09        20      11       16       19       08

Domestic Workers. This group includes domestic workers of all types: 
general maids, cooks, laundresses, parlor maids and children’s nurses. 
Most of them had been engaged in general housework rather than in any 
special type of domestic work. As a group (Table 18) they rate relatively 
high in reaction to details and in Placing and Turning, and relatively 
low in accuracy of movements as compared with unselected applicants. 
In problem solving, the upper levels of the group drop below the corre- 
sponding levels of unselected applicants. As compared with waitresses, 
domestic workers rate somewhat better in reaction to details and in 
two-hand movements. 


Table 18 


Manipulative Performance of Women Domestic Workers. (Percentile ratings on general 
norms corresponding to the 10, 25, 50, 75 and 90th percentile levels of the 
occupational group) 





                      Kent-Shakow
         Spa. Rel.  Simple  Complex  Placing  Turning  Pliers 2
No.         86       212     212      123      123       58
Occup.
P.R.     Per Cent of Unselected Applicants Surpassed by Occupational P.R.
90          93        77      79       93       89       82
75          82        65      70       81       79       62
50          55        46      53       60       60       29
25          29        26      24       40       42       12
10          14        13      09       17       19       05





Discussion 


Despite the inclusion in each occupational group of both efficient and 
inefficient workers, differences have appeared between certain occupa- 
tions of men and women. There is nothing in this study to indicate 
whether these differences would be greater or less if only efficient workers 
were included in each group. A study was made also of groups of longer 
and shorter experience. Results of this study, which are not included 
in this report, suggest that certain occupational profiles might be changed 
if inefficient workers were eliminated. 

The differences found between occupations have been greater in 
ability to solve problems, in accuracy of movements, and in ability to 
react to a multiplicity of details, than they have been in rapidity of hand 
movement or coordination of two-hand movements. Although the 
Placing and Turning tests do not differentiate between occupations, they 
remain useful to identify and eliminate from consideration for certain 
manipulative occupations those applicants whose performance falls below 
whatever may be selected as the critical score or level for a specific job 
or occupation. 

Testing for specific jobs, as for operators of specific machines, would 
reveal the test profile characteristic of workers on each job. Such pro- 
files might differ widely from the profile for machine operators shown in 
this study, which includes operators of many kinds of machines.² 


Selection of a Test Battery 


The tests to be used as an aid in selecting workers for a variety of 
jobs will depend upon the jobs themselves. 

1. Analysis of each job will reveal the types of operations involved 
and the qualifications required of each operator: reaction to simple or 
complicated details, rapidity and accuracy of one-hand or two-hand 
movements, eye-hand coordination or hand-foot or two-hand coordina- 
tion, keenness of vision, bodily strength, resourcefulness in solving prob- 
lems or adaptability in varying procedures, facility in arithmetic, spelling, 
language, or other skills. Job analysis furnishes the basis for selection of 
the tests to be used. 

2. A battery of manipulative tests may be selected or developed to 
include parts suited to all manipulative requirements; a visual test for 
visual acuity or discrimination; a physical test for requirements of 
stature, weight, muscular strength or other physical qualities; and a 
battery of tests for clerical or technical skills required for the jobs for 
which tests are to be used. 

3. An objective criterion for rating efficiency on the job is required 
in order to afford a basis for comparison of test records with production 
records. This should be the most reliable and objective measure of 
efficiency which is available. 

4. A statistical study of test records in relation to production records 
will reveal the proper weighting to be assigned the separate tests for 
prediction of performance on different jobs. 

5. In cases where test and production records disagree, the personnel 
department, by studying the individual situation, may discover emo- 
tional or other factors affecting the worker’s efficiency. 
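
Step 4 above can be sketched as a least-squares fit of production records on test scores. This is only one possible reading of "proper weighting"; the source does not name a method. The data, the function name `least_squares_weights`, and the choice of ordinary least squares are assumptions made for illustration.

```python
def least_squares_weights(test_scores, production):
    """Return [intercept, w1, w2, ...] minimizing squared prediction error.

    Solves the normal equations (X'X) b = X'y by Gaussian elimination.
    """
    rows = [[1.0] + [float(v) for v in scores] for scores in test_scores]
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    a = [
        [sum(row[i] * row[j] for row in rows) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(rows, production))]
        for i in range(k)
    ]
    # Forward elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (a[i][k] - tail) / a[i][i]
    return b


# Hypothetical scores of five workers on two tests, with hypothetical
# production figures constructed as 10 + 2 * test1 + 0.5 * test2.
scores = [(3, 8), (5, 6), (4, 9), (7, 5), (6, 7)]
production = [20.0, 23.0, 22.5, 26.5, 25.5]

weights = least_squares_weights(scores, production)
print([round(w, 3) for w in weights])
```

With real records the fit would not be exact, and the residuals would point to the cases in step 5 where test and production records disagree.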


² A group of 40 female hand operators tested in one factory by the writer had a 
median of 61 percentile on Plier Dexterity, as compared with 33 percentile which is the 
median of the miscellaneous female hand operators presented in this study. Of the 40 
operators referred to, 30 had jobs requiring delicate movements. 







What Causes Efficiency? 


If experienced workers are found to differ appreciably from inexperi- 
enced workers on a job, is the gain or loss in certain qualities tested due 
to the elimination of those who work at different rates of speed, or is it 
due to practice on the job? Are the traits which are tested true apti- 
tudes or are they skills which may be acquired by practice or improved 
by persistent use? 

Nothing in the present study can suggest the answer to this question. 
But systematic studies by personnel departments in industry or by 
vocational and trade schools would indicate the answer. 

If the traits are aptitudes showing wide individual differences, indus- 
try and trade schools cannot well afford to overlook them or to neglect 
measuring them in the process of selecting workers or students. If they 
are skills which can be trained, then it is important that new workers 
and students should be inducted into their work through a course of 
training which includes practice directed toward perfection of the skills 
or traits which are basic or essential to the job or trade. 











Where They Like to Work; Work Place Preference of 
228 Electrical Workers in Terms of Music 


W. A. Kerr 
RCA Victor Division, Radio Corporation of America 


Two studies (1, 5) indicate that the majority of employees who have 
experienced music in the factory are strongly in favor of it. While it is 
well known that workers will elect jobs in those plants and industries 
which provide better pay or working conditions (4), no study has yet 
been made of many of the specific work-place characteristics which may 
also play an important part. It is the purpose of this study to inquire 
into the probable effect of music in the factory as an “attraction” or 
inducement to the worker with music to remain at his job and as an 
inducement for the worker without music to move to a job location where 
music is played. 

In order to achieve this purpose, two groups of workers were selected, 
one group having the status quo condition of daily broadcasts of recorded 
music over a plant communication system and the other having the 
status quo condition of no music. Since the Gallup, Roper, and similar 
polls have established the fact that the behavior of a group may be 
predicted closely by anonymous measurement of the expressed opinions 
of representative individuals composing the group, this study attempts 
to measure the music or no-music work-place preferences of 228 electrical 
workers, leaving to the reader the implications, if any, for labor turnover 
and mobility. 

The subjects of this study were 204 radio tube workers who listen 
regularly to industrial music and 24 electrical assembly workers who do 
not hear industrial music. Both groups are of similar composition and 
appear to be of average socio-economic status. Ninety-seven per cent 
of these 228 native-born white employees are female and the age range 
for the total group is from 18 to 47 with an average age of 23.2. 

Item 2 in the form which is shown as Figure 1 was included as a sort 
of equivalent of Item 1 in order to determine the reliability of measure- 
ment. Age and sex are the only other information requested. The 
tear-with-your-fingers technique, an original innovation, was introduced 
to eliminate the need for distributing pencils. 

                         Work Preference Ballot 

Please answer each question by Tearing the Paper with Your Fingers 

Do not sign your name. This is simply part of a research study. Your 
answers will in no way influence your present or future status here. 

1. If you are given a choice of working in either of two different departments 
   in this plant on identical jobs which carry equal pay, responsibility, and 
   prestige, but which differ in that some recorded music is played in one 
   department while employees work and music is not played in the other 
   department, which department would you prefer to work in? (TEAR ONE) 

   A. The department with music 
   B. Either department; it wouldn't matter 
   C. The department without music 

2. While working in an industrial plant, how much of the time would you like 
   to hear music? 

   A. None of the time 
   B. Rarely 
   C. Occasionally 
   D. Frequently 
   E. All of the time 

3. ... Female 

4. Your age: (TEAR nearest ... 40 | 45 | 50 | 55 | 60 

Fig. 1. Questionnaire form for securing data on work place preference. 


Inspection of Table 1 reveals that the responses are not distributed 
randomly for either Item 1 or Item 2. Chance expectancy for each 
response to Item 1 is 33 per cent and for each response to Item 2 is 
20 per cent. In order to check the statistical significance of differences 
between the number preferring to work in a department with music and 
the number wanting to work in a department without music, the standard 
errors of the differences were computed as shown in Table 2. The critical 
ratios and probabilities in this table show rather conclusively that these 
two groups of electrical workers, other factors being equal, would tend, 
too often to be accounted for by chance, to elect jobs in a department 
where music is being played. The formula used in obtaining σA and σC 
is the following (2): 

$$\sigma = \sqrt{Npq}$$
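
The computation behind Table 2 can be sketched as follows, assuming that p is the proportion of the N respondents giving a response and q = 1 - p. The function names are hypothetical. The critical ratio uses the difference and standard error as tabled, since the source does not state how the standard error of the difference was derived.

```python
import math


def binomial_sd(n, count):
    """Standard deviation of a frequency: sqrt(N * p * q), with p = count / N."""
    p = count / n
    return math.sqrt(n * p * (1 - p))


def critical_ratio(diff, sd_diff):
    """Difference divided by the standard error of the difference."""
    return diff / sd_diff


# Tubes group in Table 2: N = 204, of whom 152 chose response A.
sigma_a = binomial_sd(204, 152)

# Critical ratio from the tabled difference (150) and its standard error (6.42).
cr = critical_ratio(150, 6.42)
print(round(sigma_a, 2), round(cr, 2))
```

The recomputed standard deviation comes out near 6.2, close to the tabled 6.20, and the critical ratio reproduces the tabled 23.36.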
















Reliability of these measurements is satisfactory for the type of group 
analysis applied, as is shown by a correlation between “equivalent” items 
one and two of .48 ± .04 for the tubes group, .68 ± .07 for the electrical 
assembly group, and .52 ± .04 for the combined total of 228 employees. 
Actually these coefficients probably underestimate the reliabilities 
because Item 2 is not exactly equivalent to Item 1. 

The distribution of responses to Item 2, as shown in Table 1, indicates 
that the modal tendency is a preference to hear industrial music 
“frequently.” Less than 1 per cent state that they never want to hear it, 
and 99 per cent want to hear it occasionally, frequently, or all the time. 
The overwhelming majority of the workers in the electrical assembly 
department, which does not have music, prefer to hear music while they 
work. Employees in the tubes department, which does have music, 
strongly affirm their desire for industrial music. 


Table 1 
Percentage of Each Group Giving Each Response to The Two Work Preference Items 

                                     Tubes    Assembly     All
Response                            N = 204    N = 24    N = 228

1. A. Department with music          73.8       34.6      69.3
   B. Either department              25.3       65.4      29.9
   C. Department without music       00.9       00.0      00.8
                                    100.0      100.0     100.0

2. A. None of the time               00.4       00.0      00.4
   B. Rarely                         00.4       00.0      00.4
   C. Occasionally                   34.5       53.8      36.8
   D. Frequently                     49.5       42.4      48.7
   E. All of the time                15.2       03.8      13.7
                                    100.0      100.0     100.0


Table 2 
Reliability of the Difference Between Numbers Preferring the Department With Music 
(A) and Numbers Preferring the Department Without Music (C) 

Group        N     A    C    σA     σC    Diff.  σdiff.   C.R.   Probability
Tubes       204   152   2   6.20   1.34    150    6.42   23.36      99.8
Assembly     24     9   0   2.34    .00      9    2.34    3.85      99.8
All         228   161   2   6.90   1.34    159    7.09   22.43      99.8

For the total group of electrical workers, age is negatively correlated 
(see Table 3) with preference for the department with music and female 
sex is positively correlated with such preference. All of the correlations 
in this table are statistically significant. It should be noted, however, 
that the two sexes differ in several other variables; the relatively few 
males all occupy supervisory positions, do different types of work, 
represent higher socio-economic and educational backgrounds, and are of a 
higher average age than the females. As indicated in Table 3, when sex 
is held constant by partial correlation, the coefficient of correlation 
between age and music work place preference declines to -.14. 

While the correlation between sex and music work place preference 
is reduced to .19 by holding age constant, the importance of this 
correlation is further reduced by the fact that educational background and 
type of work are different for these sex samples. 


Table 3 
Correlations Between Music Work Place Preference and Certain Other Variables 

Variable:                      1       2       3
1. Preference for music              -.23     .26
2. Age                                       -.41
3. Sex

r12.3 = -.14        r13.2 = .19
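
The two partial coefficients can be checked against the zero-order correlations in Table 3 with the standard first-order partial correlation formula. The sketch below assumes that this is the formula the author applied; the function name `partial_r` is hypothetical.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Zero-order correlations from Table 3 (1 = preference, 2 = age, 3 = sex).
r12, r13, r23 = -0.23, 0.26, -0.41

r12_3 = partial_r(r12, r13, r23)  # preference and age, sex held constant
r13_2 = partial_r(r13, r12, r23)  # preference and sex, age held constant
print(round(r12_3, 2), round(r13_2, 2))  # -0.14 0.19
```

Both values agree with the partial coefficients reported in the table.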


Conclusions 


1. The great majority of 24 electrical assembly workers and 204 tubes 
workers prefer to work in a department with music as compared with a 
department without music. 

2. Modal tendency is for the tubes workers to desire music “fre- 
quently” and for the electrical assembly workers to desire it “occasion- 
ally.” 

3. A low negative correlation exists between age and preference for 
the department with music. 

4. A tendency is found in these data for females to be more favorable 
than males toward the department with music, but it is believed that 
this sex difference is largely accounted for by a selective factor which 
operated. 

5. To the extent that these employees are similar to other industrial 
workers, this study might suggest that, when other factors are equal, 
workers will go to job locations where they can hear music while they 
work or stay at their present jobs provided they have music. A 
suggested basis for these implications is that the normal personality 
usually seeks the pleasant and avoids the unpleasant; industrial music 
apparently is held to be pleasant by these workers. At a time when workers 
are hard to locate and to hold, these findings may have meaning for 
employment managers and students of the labor market. 

6. A new “tear with your fingers” polling method has been devised 
and used with satisfactory reliability. This method saves time in meas- 
urement of opinion in factories. 


References 


1. Freyman, Richard. Study with listener research department of British Broadcasting 
Company in various departments of a radio factory. London: British Broad- 
casting Co., 1942. 

2. Guilford, J. P. Psychometric methods. New York: McGraw-Hill Book Co., Inc., 
1936, pp. 76-78. 

3. Kerr, W. A. Psychological effects of music as reported by 162 defense trainees. 
Psychol. Record, 1942, 5, 205-212. 

4. Lester, R.A. Economics of labor. New York: Macmillan Co., 1941, pp. 913. 

5. Humes, J. F. The effects of occupational music on scrappage in the manufacture 
of radio tubes. J. appl. Psychol., 1941, 25, 573-587. 





The Seashore Measures of Musical Talent and Speech Skill 
Howard Gilkinson 


University of Minnesota 


The relationship of speech and hearing has long been a subject of 
interest to psychologists and teachers of speech. The discussions of this 
topic which have appeared in speech textbooks during the last decade 
indicate universal acceptance of the idea that normal speech depends 
upon normal hearing. This belief is supported by observations of de- 
ficiencies and peculiarities of vocal performance among the congenitally 
deaf and the deterioration of speech which sometimes accompanies a loss 
of hearing. The audiometer is the instrument usually employed in 
measuring hearing deficiencies. 

There is apparently a further belief held by some authors that the 
aesthetic characteristics of an individual’s speech, i.e., the expressive 
variation of pitch, force, quality, and rate, depend to an important 
extent upon his ability to discriminate between small differences in 
sound. In this connection the Seashore Measures of Musical Talent are 
sometimes recommended as useful instruments for discovering the indi- 
vidual’s ability to make those auditory discriminations upon which an 
expressive vocal pattern is said to depend. In making this recommenda- 
tion one author (2) says that dull hearing is usually due to lack of 
training, and that improved habits of sound discrimination can be ac- 
quired. Seashore (3) has stated that ear-mindedness has much the same 
function in speech that it has in music. The idea that speech and 
musical performance have some common underlying factors receives 
support from the fact that ability to carry a tune ‘‘very well” is more fre- 
quently claimed by superior speakers than by inferior speakers (1). 

We are interested at the moment not so much in the general idea that 
hearing is in some way indispensable to normal speech, which is commonly 
accepted, but in the more specific proposition that expressive vocal per- 
formance depends upon certain types of auditory ability which are 
measured by the Seashore tests. If the latter thesis is correct, it seems 
logical to suppose that deficiencies in vocal performance should be found 
more frequently among those individuals who score low on the auditory 
tests than among those who achieve high scores. For example, one 
might expect to find a larger number of monotonous voices among the 
former than among the latter. Such differential trends should emerge 
even though factors other than auditory ability operate as determinants 
of speech skill, provided these other factors are randomly distributed 
between the criterion groups. 

During the fall quarter of the school year 1940-41, the revised 
Seashore battery was given to 377 students in the beginning speech course 
at the University of Minnesota. Series A was employed, the subjects 
being tested in groups from ten to seventy-five in number. Early in 
the quarter these subjects gave extemporaneous talks and each was rated 
by from ten to twelve classmates for excellence of vocal performance in 
relation to quality, pitch, force, and rate. Committees composed of two 
teachers and an advanced speech student heard 148 of these subjects and 
noted the faults and qualities of each speaker’s voice on a check list of 
descriptive terms. The ratings by student judges yielded reliability 
coefficients ranging from .69 to .82, using the split-half method and 
correcting the coefficients for twice the number of judges. 


Table 1 
Comparison of Speech Ratings for High and Low Scoring Groups on Each 
Test by the Method of Critical Ratios * 

                     Seashore Measures of Musical Talent
Speech Ratings   Pitch   Loud.   Rhyt.   Time   Timb.   Ton. Mem.
Quality           3.36    1.27    1.29   1.21    2.43     1.86
Pitch             2.72     .43    1.08    .21     .96     2.21
Force             3.42     .64     .71    .38     .75     2.21
Rate              3.83     .77    1.18   1.77    1.54     2.42
Average           4.20     .69    1.18   1.08    1.73     2.67

* Difference/Sigma of Difference. 
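
The split-half correction mentioned above is presumably the Spearman-Brown formula, although the source does not name it. The sketch below rests on that assumption. The half-panel correlation of .60 is hypothetical and is not a value from this study.

```python
def spearman_brown(r, factor=2.0):
    """Reliability expected when the number of judges is multiplied by `factor`."""
    return factor * r / (1 + (factor - 1) * r)


# Hypothetical correlation between the averaged ratings of two half-panels.
r_half = 0.60

# Estimated reliability of the full panel (twice the number of judges).
r_full = spearman_brown(r_half)
print(round(r_full, 2))  # 0.75
```

With a factor of 1 the formula returns the half-panel correlation unchanged, which is a quick check that the function is behaving as intended.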

The 377 subjects who were rated for speech performance by their 
classmates showed wide individual differences on the six Seashore Meas- 
ures, approximating the entire range of scores indicated in the published 
norms (4). It was possible therefore to designate the one hundred 
highest scoring subjects on each Measure as a high discrimination group 
and the one hundred lowest scoring subjects as a low discrimination 
group, and to compare these groups for excellence of vocal performance. 
There were fifty men and fifty women in each criterion group. The 
critical ratios resulting from the comparison of the averaged speech 
ratings of these groups are given in Table 1. On all six tests, the high 
scoring group rated higher on speech performance than did the low 
scoring group, although it is only in relation to the pitch discrimination 
scores that the critical ratios reach or exceed 3.00. 
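
The critical ratio used in Table 1 is the difference between two group means divided by the sigma of that difference. A minimal sketch follows, assuming two independent groups and the usual standard error of a difference between means; the means and standard deviations are hypothetical, since the article reports only the resulting ratios.

```python
import math


def critical_ratio(mean_hi, sd_hi, n_hi, mean_lo, sd_lo, n_lo):
    """Difference between two group means divided by the sigma of the difference.

    Assumes independent groups, so the sigma of the difference is
    sqrt(sd_hi^2 / n_hi + sd_lo^2 / n_lo).
    """
    sigma_diff = math.sqrt(sd_hi ** 2 / n_hi + sd_lo ** 2 / n_lo)
    return (mean_hi - mean_lo) / sigma_diff


# Hypothetical mean speech ratings for two groups of 100 (not data from the study).
cr = critical_ratio(5.4, 1.0, 100, 5.0, 1.0, 100)
print(round(cr, 2))
```

A ratio of 3.00 or more is the criterion the text applies when judging a difference to be dependable.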

The 148 subjects, who were heard by judging committees of teachers
and advanced speech students, also yielded wide ranges of scores on the 
Seashore measures. It was possible, therefore, to compare the extremes 
of the distributions of Seashore scores for possible differentiation in 
speech performance. In this case interest centers on the analysis of 
particular types of auditory ability in relation to particular classes of 
descriptive terms. The results are given in Table 2. 


Table 2


Distribution of Voice Characteristics as Observed by Teachers Among 50 Low and
50 High Scoring Students on Three Seashore Measures of Musical Talent


[Table body not recoverable from the scan. Legible entries include the
descriptive terms "too high," "narrow pitch span," and "monotonous inflection"
under Pitch Discrimination, and "nasal twang" and "nasal whine" under Timbre
Discrimination, each with frequencies for the low (L) and high (H) scoring
groups.]





There were eight terms or phrases in the check-list used by the judg- 
ing committees which referred specifically to the operation of the pitch 
factor in vocal performance. The table shows the distribution of these 
faults and qualities of voice, as noted by one or more of the three judges, 
between the fifty speakers who scored highest on the Seashore test of 
pitch discrimination (H) and the fifty speakers receiving the lowest 
scores (L). For example, 12 voices among the former were described as 
being “too low” as compared with 14 among the latter. The distribution
of six descriptive terms contained in the check-list which referred spe- 
cifically to vocal intensity between the fifty students who scored high 
and the fifty scoring low on the Seashore test of intensity discrimination 
is given. There were fifteen terms in the check-list referring specifically 
to voice quality, and the distribution of these descriptions is given as 
between the fifty students who scored high on the Seashore test for 
timbre discrimination and the fifty students who scored poorly. There
were twenty-five men and twenty-five women in each high scoring and 
in each low scoring group. 

One is impressed in examining Table 2 with the striking similarity 
with which the judged vocal characteristics are distributed between those 
students who did well on the Seashore Measures of Musical Talent and 
those who did poorly. Such differences as appear are well within the 
range of chance fluctuation. Furthermore, these differences show no 
discernible pattern. 

Other studies have been made to discover relationships between the 
Seashore scores and speech. Weaver (6) gave the musical aptitude tests 
to two hundred and ten male college students after they had been rated 
for excellence of speech by their classmates. He found correlations 
between vocal expression and the various auditory discrimination scores 
ranging from .18 for sense of intensity to .48 for sense of pitch. Travis 
and Davis (5) found that a group of superior speakers scored higher on 
the Seashore tests for pitch, intensity, and tonal memory than did a 
group of inferior speakers. These two previous studies and that part 
of the present investigation involving speech ratings by laymen (Table 1) 
all yield evidence of a slight though positive relationship between some 
of the Seashore scores and speech skill, the test for pitch discrimination 
being the most definite and consistent. In the Minnesota study, how- 
ever, the correlation between the pitch discrimination scores and the 
average speech ratings for 377 subjects was only +.17 ± .03.
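
The .03 attached to the correlation of +.17 is consistent with a probable error of r, a convention common at the time, although the article does not say which error measure was used. A minimal sketch under that assumption:

```python
import math


def probable_error_r(r: float, n: int) -> float:
    """Probable error of a correlation coefficient: 0.6745 * (1 - r^2) / sqrt(n).

    Treating the reported .03 as a probable error is an assumption;
    the article does not identify the statistic.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# r and N as reported for the Minnesota group.
pe = probable_error_r(0.17, 377)
print(round(pe, 2))
```

For r = .17 and N = 377 the formula gives a value that rounds to .03, which agrees with the figure in the text.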

It should be noted that in the three sets of data yielding positive 
evidence of relationship, the criteria of speech excellence were quite 
general. Weaver’s subjects were ranked according to over-all judgments 
of the “effectiveness of vocal rendition.” The subject groups in the study
carried out by Travis and Davis were differentiated on the basis of
general ability to speak. In the present study the students were asked 
to rate each other specifically on quality, pitch, force, and rate. How- 
ever, the inter-correlations among these four factors were found to be 
high, indicating strong influence of halo effect and suggesting that the 
ratings made by these student judges were based largely on their impres- 
sions of the general excellence of the speaker’s performance. It seems 
then, in summarizing the outcomes of the three studies, that there are 
slight relationships between over-all or general impressions of the ex- 
cellence of speech performance and Seashore test scores. This holds 
quite certainly for pitch discrimination, and probably also for tonal 
memory. 

On the other hand, the attempt to relate particular test scores to 
specific vocal characteristics in the present study yields no positive 
results. The judges, particularly the teachers, were experienced in
observing speech behavior. It would be difficult to determine how well
they agreed with each other, but when the analysis was carried out in 
terms of specific traits noted by two or more judges, the results were still 
negative. 


Summary 


Taking the present study and two previous studies into consideration, 
the following conclusions are indicated: 

1. There is a low order of relationship between Seashore scores and 
speech skill when general criteria of speech skill are employed. 

2. The attempt in the present study to relate scores on various Sea- 
shore tests to specific vocal habits produced no positive results. 

It seems to this writer that, for the time being at least, we must say 
that we can attach no precise meaning to the Seashore tests in respect 
to their relationship to speech skill, and that their practical diagnostic 
value in general speech courses is very questionable. This conclusion 
obviously applies only to the use of the group measures (Form A), and, 
furthermore, has no bearing upon the value of the tests in the field of 
music, for which they were originally intended. 


References 


1. Gilkinson, H., and Knower, F. H. Psychological studies of individual differences
among students of speech. Speech Department Mimeographed Monograph,
University of Minnesota, 1939. Pp. 196.

2. Murray, E. The speech personality. Philadelphia: J. B. Lippincott Company, 1937.

3. Seashore, C. E. Psychology of music. New York: McGraw-Hill Book Company,
1938.

4. Seashore, C. E., Lewis, D., and Saetveit, J. G. Manual of instructions and
interpretations for the Seashore Measures of Musical Talent. RCA Manufacturing
Company, Camden, New Jersey, 1939.

5. Travis, L., and Davis, M. The relation between faulty speech and lack of certain
musical talents. Psychol. Monogr., 1926, 36, 71-81.

6. Weaver, A. Experimental studies in vocal expressiveness. Quart. J. Speech Educ.,
1924, 10, 199-204.





Prediction of College Scholarship for Groups Having 
Effort Indices of Restricted Range 


Bert R. Sappenfield 
Montana State University 


Previous studies of the relationship between aptitude test perform- 
ance and college scholarship have generally yielded low relationships, 
expressed by correlation coefficients which are rarely higher than 0.50. 
Correlations of this small magnitude have been interpreted as grounds 
for discouraging further attempts to predict scholarship on the mere basis 
of aptitude scores. Wide individual differences in academic motiva- 
tion have frequently been mentioned as a reason for the generally low 
correlations. 

Problem 


An indirect measure of motivational factors in the school situation 
is the achievement quotient (A.Q.), which may be defined as the ratio 
of achievement age to mental age. It may be inferred that an individual 
with an A.Q. above 100 is one who has been motivated to work harder 
or longer than the average individual of equivalent aptitude, and that 
individuals with A.Q.’s approximating 100 have been expressing an 
average degree of academic drive. It is reasonable to expect that, out 
of a total group of students, one should be able to select a subgroup 
whose scholarship would be highly correlated with aptitude, by using 
as the principle of selection a relatively homogeneous range of A.Q. 
values. Such selection would be equivalent to holding the degree of 
motivation fairly constant. While aptitude scores and school marks are 
not expressed in terms of mental age and achievement age, it is possible 
to derive a satisfactory achievement quotient if the scores and marks are 
first translated into standard scores. 


Method 


The data of the present study included the four-year average of high 
school marks, the first-year average of college marks, and the score on 
the N. Y. U. College Aptitude Examination for each of 196 male freshman
students in the University College and College of Engineering at
New York University in the academic year 1937-38. 

High school and college marks were reported in numerical terms. 
The N. Y. U. College Aptitude Examination was administered, prior to
matriculation, as a three-hour test, consisting of 150 items. Means and 
SD’s of the distributions of marks and aptitude scores are presented in 
Table 1. 

High school averages, college averages, and aptitude scores were
translated into standard scores by the formula

$$\text{standard score} = \frac{10\,(X - M)}{SD} + 50,$$

where $X$ is a raw score and $M$ and $SD$ are the mean and standard
deviation of its distribution. This formula yields standard scores in a
distribution having a mean of 50 and a standard deviation of 10.


Table 1


Means, Standard Deviations, and Intercorrelations for High School Marks,
College Marks, and Aptitude Scores *


                          High School    College    Aptitude
                          Marks          Marks      Scores

Mean                      [illegible]    76.04      84.68
SD                        [illegible]     9.67      19.02
r with aptitude score     .362           .369       --
r with college marks      .615           --         --


* N = 196. High school marks are four-year averages. College marks are
first-year averages.


High school achievement quotients were computed by dividing each
student’s high school average by his aptitude score, both divisor and
dividend being expressed as standard scores. Achievement quotients
derived in this manner may be termed high school Effort Indices (E.I.),
to distinguish them from A.Q.’s derived from the usual age-scale scores
which are employed in the elementary school.
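
The computation of an Effort Index can be sketched as follows. The sketch assumes the usual linear conversion to standard scores with a mean of 50 and a standard deviation of 10, the population form of the standard deviation, and a multiplier of 100 (inferred from the fact that the indices cluster around 100); the function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def standard_scores(raw):
    """Convert raw values to standard scores with mean 50 and SD 10."""
    m, sd = mean(raw), pstdev(raw)  # population SD is an assumption
    return [10 * (x - m) / sd + 50 for x in raw]


def effort_indices(marks, aptitude, scale=100.0):
    """Ratio of standardized marks to standardized aptitude, times `scale`.

    The multiplier of 100 is an assumption; the article does not state it.
    """
    z_marks = standard_scores(marks)
    z_aptitude = standard_scores(aptitude)
    return [scale * m / a for m, a in zip(z_marks, z_aptitude)]


# Hypothetical students whose marks rank exactly as their aptitude scores do.
marks = [70, 80, 90]
aptitude = [60, 90, 120]
print([round(ei, 1) for ei in effort_indices(marks, aptitude)])
```

A student whose relative standing in marks matches his relative standing in aptitude receives an index of 100; marks above that standing push the index higher.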


Results 


The correlations of aptitude scores with both the high school averages 
and the college averages for the total group were fairly low (.362 and 
.369), as shown in Table 1. The relationship between high school marks 
and college marks was much higher (.615). 

In Table 2 (Column 4) are presented the correlation coefficients be- 
tween aptitude scores and average first-year college marks, for various 
subgroups composed of individuals in restricted Effort Index ranges. 
It is evident from the data in Rows 2, 5, 7, 8, and 9 that as the range of 
Effort Indices was increased, the correlation between college marks and 
aptitude scores decreased markedly. Stated differently, the more nearly 
homogeneous the group was with respect to Effort Indices, the higher 
was the relationship between aptitude and college performance. 
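
The subgroup analysis described above amounts to keeping only the cases whose Effort Index falls in a chosen range and correlating aptitude with college marks among those cases. A minimal sketch, with hypothetical records and function names, assuming inclusive range limits:

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_within_ei_range(records, low, high):
    """Return (N, r) for cases with low <= E.I. <= high.

    Each record is a tuple (effort_index, aptitude_score, college_mark).
    """
    kept = [(apt, mark) for ei, apt, mark in records if low <= ei <= high]
    aptitude, marks = zip(*kept)
    return len(kept), pearson_r(aptitude, marks)


# Hypothetical records, for illustration only.
records = [(100, 40, 70), (98, 50, 75), (102, 60, 80), (60, 65, 60), (140, 35, 90)]
print(r_within_ei_range(records, 95, 104))
print(r_within_ei_range(records, 40, 239))
```

In this illustration the narrow range yields the higher correlation because the cases with extreme indices, whose marks depart most from their aptitude, are excluded.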







If the practical problem were to select students on the basis of data 
obtainable at the time when the students applied for admission, the data 
of Rows 1 to 6, Table 2, indicate that about 60 to 80 per cent of students 
(those having high school E.I.’s above 80) could be expected to show 
college performance which was fairly well adjusted to their relative 
aptitudes. Students with high school E.I.’s below 80, evidently, would 
perform in a manner not readily predictable on the basis of their relative 
aptitudes. It would seem, then, that, if the accuracy of predicting
college scholarship on the basis of high school marks and aptitude scores
is to be improved, it might be well first to exclude the cases with low
Effort Indices. The latter cases might be studied more exhaustively in
an attempt to discover what additional factors should be considered for
the prediction of their performance. Moreover, the students with low
E.I.’s probably would derive more than an average degree of benefit
from educational guidance.


Table 2


Correlations between College Aptitude Scores and College First-Year Marks
for Subgroups of Restricted E.I. Ranges


       (1)                (2)    (3)              (4)
       Range of High             Per Cent of      r (Aptitude vs.
Row    School E.I.’s      N      Total Group      College Marks)

1      Above 105           78     40              .620
2      95-104              41     21              .816
3      Below 95            77     39              .304
4      Above 120           46     24              .634
5      80-119             114     58              .661
6      Below 80            36     18              .178
7      70-129             147     75              .569
8      60-139             167     85              .408
9      40-239             196    100              .369


Table 3


Intercorrelations Among Effort Indices *


                      Regents    College
                      E.I.       E.I.

High School E.I.      .676       .692
Regents E.I.          --         .719


* N’s vary from 168 to 196.

Whether the Effort Index is to be a useful concept for prediction and 
guidance will, of course, be dependent upon its relative constancy. 
Table 3 provides data in support of the thesis that the E.I. is relatively
constant. Effort Indices were computed for college marks in relation to 
aptitude scores (College E.I.) and for N. Y. State Regents Examination 
marks in relation to aptitude scores (Regents E.I.). The intercorrela- 
tions between the different Effort Indices were relatively high (.676 
to .719). 


Summary 


Effort Indices, which express the ratio between high school achieve- 
ment and aptitude scores, have been shown to provide a practical means 
of selecting from a class of college freshmen a subgroup whose college 
performance was more than normally predictable on the basis of aptitude 
scores. The more homogeneous a subgroup was with respect to high 
school E.I.’s, the higher was the relationship between aptitude and 
college scholarship. The college freshmen who had high school E.I.’s 
above 80 performed closely in accordance with their relative aptitudes, 
while the performance of students having lower E.I.’s was much less 
predictable. The latter deserve further study, and would presumably 
benefit from educational guidance. The Effort Index was found to be a 
relatively constant measure. 








A School of Nursing Selection Program * 


Blake Crider 
Fenn College, Cleveland, Ohio 


At the end of the school year of 1941-42, the School of Nursing of 
St. Luke’s Hospital of Cleveland, Ohio, had completed a five-year experi- 
ence in the selection of nursing school applicants on the basis of psycho- 
metric tests, interview ratings, and other psychological techniques. This 
experience is reported in the belief that the findings and point of view 
may be of aid to others who are engaged in a personnel selection program. 

In 1936, a battery of psychometric tests ¹ was given to a class of
28 students already admitted to the School but no particular use of the 
results was made. Eight students finished their training. The follow- 
ing year a similar battery of tests was given to all applicants for the 
nursing program. Not knowing exactly which scores on the tests would 
separate the good risks from the poor, those were rejected who were 
thought to be the greatest risks. In a class of 83 only 55 finished the 
program. 

This preliminary experience is cited to indicate that St. Luke’s Hos- 
pital School, like most other schools, consistently lost about one-third 
or more of each class. A writer in the 34th Annual Report of National 
League of Nursing Education says 25 to 50 per cent of student nurses 
withdraw before completing their program. However, the St. Luke’s 
Hospital School has now reached the point where the problem of failure 
is largely eliminated. 

The most valuable experience obtained from the preliminary psycho- 
metric testing was that test data were found to be of little value without 
a personal interview by the psychologist so that the test data might be 
evaluated clinically. In psychology, as in medicine, contact with the 
individual is essential if the maximum value from test information is to 
be obtained. 


In 1939, the psychologist began to interview all applicants. Not
knowing exactly what ability was required for nursing success, it was
tentatively assumed that the applicant should have an Otis I.B. of 110,
be able to read as well as a typical college freshman, and have at least
ninth grade ability in arithmetic. It was assumed, furthermore, that
the applicant should make a favorable score on the personality test and
show a high rating on the interest inventory. Before applying, the
applicants had been partially screened by requiring them to have an
average of 85 or more in their high school work and to have satisfactorily
completed certain basic courses.


* Special appreciation is due to Miss Hazel Goff, Director of the School of Nursing,
whose interest in nursing education made this study possible.

¹ Tests used were: Otis Self-Administering Test of Mental Ability, Higher Form;
Schorling-Clark-Potter Arithmetic Test, Form A, Revised; Nelson-Denny Reading Test
for Colleges and Senior High Schools, Form A; Strong Vocational Interest Blank for
Women; and Bell’s Adjustment Inventory for Students.

Effort was made in the interview to assess four personality qualities: 
Is the applicant emotionally and socially mature? Can she adapt to a 
hospital atmosphere and routine? Is she emotionally stable? Is she 
aggressive without being domineering? 

A two page data and interview sheet was used for recording informa- 
tion. The first page was in the form of a ten-step profile where all the 
psychometric scores could be recorded as well as the subjective appraisal 
of the applicant’s four personality qualities. On the second page ample 
space was provided for noting special strengths and special weaknesses. 
Space was also available for summarizing the totality of the psycholo- 
gist’s clinical judgment gained from the test information and the personal 
interview. 

At the bottom of the second sheet a final over-all rating was recorded. 
This was designed to encompass succinctly opinion about the applicant. 
The applicant was thus rated as very poor, poor, only fair, good, very 
good. Finally, one of three recommendations was recorded: not recom- 
mended; conditionally recommended; and recommended. The final 
rejection or acceptance of a conditionally recommended applicant was 
generally made in a committee conference.

The information on the data and interview pages together with the 
results of the applicant’s physical examination, high school record, and 
a personality rating from an acquaintance were sent to the Director of 
the School for use in conducting a final interview and evaluation. 

Each applicant spent an entire day in the hospital to complete her 
examination. On the average, about five or six applicants were examined 
each day. A summary of the results is presented in Table 1. 

Including the first 83 applicants who were tested but not interviewed 
by the psychologist, 360 students completed at least one year in the 
School of Nursing. This number includes also those who completed two 
years or were graduated. In the group as a whole, the academic failure 
rate was 5 per cent plus a loss rate of 6.9 per cent due to lack of interest 
or personality difficulties. The marriage of 2.5 per cent of the students 
brought the psychological loss rate to approximately 14 per cent. Loss
by marriage should not be charged against the personnel selection pro- 
cedure, since there is no way to anticipate marriage. That these students 
were acceptable candidates is indicated by the fact that their average 
in all tests was slightly above the average of the students that remained. 
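
The loss rates quoted above can be checked against the counts given in Table 1 (18 academic failures, 25 personality or interest losses, and 9 marriages among the 360 students who entered). The sketch below is a check on the arithmetic only; the variable names are illustrative.

```python
# Counts taken from Table 1 of the article; names are illustrative.
entered = 360
losses = {
    "academic failure": 18,
    "personality or lack of interest": 25,
    "marriage": 9,
}

# Each loss expressed as a percentage of the students who entered.
rates = {reason: round(100 * count / entered, 1) for reason, count in losses.items()}
total_rate = round(sum(rates.values()), 1)

for reason, rate in rates.items():
    print(f"{reason}: {rate} per cent")
print(f"combined: {total_rate} per cent")
```

The three rates sum to 14.4 per cent, in line with the "approximately 14 per cent" stated in the text.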

An analysis of Strong’s Interest Inventory showed that 90 per cent
of the applicants had an “A” rating in nursing interest. However, 85
per cent of those students who left the School for reasons other than
marriage had an “A” rating in nursing. As a selective or a prognostic
device the interest inventory did not prove to be significant. It is safe
to assume, therefore, that practically all applicants for admission to the
School are interested in being nurses. The personality inventory,
likewise, contributed little to the selection process. Of all the students who
left, only three showed indications of emotional instability on the
inventory, and these were only slightly maladjusted.


Table 1


Data on 506 Nursing School Applicants and Students


                                                    Mean Scores
                                      Per
                               N      Cent    Otis I.B.  Arith.  Read.  Total *

1. Rejected
   a. Psychologically          70     13.8     99.3      52.5    62.5   180.9
   b. Medically                37      7.3
2. Accepted                   399     78.9    112.0      67.0    85.0   220.2
3. Entered                    360     71.1    112.1      67.0    85.3   219.7
4. Finishing 1 year plus      307     85.8    112.6      67.5    87.3   224.0
5. Leaving
   a. Failure                  18      5.0    102.2      53.7    63.6   191.5
   b. Personality              25      6.9    112.2      59.6    88.8   219.5
   c. Marriage                  9      2.5    115.3      67.7    88.8   229.3
   d. [label illegible]         4


* A total score results from adding the student’s score on vocabulary and arithmetic
to the I.B.

It was originally believed that an applicant should have an I.B. of 
110 and on the profile sheet any I.B. of 107 to 115 was rated as average. 
However, applicants were accepted in the 90-94 range and some appli- 
cants were rejected with I.B.’s as high as 120. Reference to Table 1 
shows the average I.B. of the rejected, the accepted, of those who entered, 
and those finishing one year or more, as well as the average of those who 
left. Here, as in the other data, the rejected applicants were inferior to
those who actually failed. This furnishes a partial answer to the often
asked question, “What about those girls who are rejected?” It is evident
that the failure rate would have been higher had those rejected been
accepted. Nevertheless, a number of the rejected applicants could have 
finished the course. 

Since averages mask individual variations it is necessary to elaborate 
the findings. Six of the 20 students under 100 I.B. failed. There were 
341 students with I.B.’s above 100 but only 12 failed. Only two of the 
220 students above 110 I.B. failed. This refers specifically to academic 
failure, although it is often difficult to separate personality factors that 
determine failure from personality factors that are the result of failure. 
However, an analysis of losses because of personality problems shows that 
these students were of approximately equal ability to those that entered 
and indicates that the loss was probably due to personality problems 
rather than to a lack of ability. 
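
The academic failure counts given above can be restated as rates for each I.B. band. A short sketch of that arithmetic, using only the counts stated in the text (the labels and function name are illustrative):

```python
def failure_rate(failed: int, total: int) -> float:
    """Academic failures as a percentage of the students in a band."""
    return round(100 * failed / total, 1)


# (failed, total) as stated in the text; "above 110" is a subset of "above 100".
bands = {
    "I.B. under 100": (6, 20),
    "I.B. above 100": (12, 341),
    "I.B. above 110": (2, 220),
}

for label, (failed, total) in bands.items():
    print(f"{label}: {failure_rate(failed, total)} per cent failed")
```

The rate falls from 30 per cent among students under 100 I.B. to under 1 per cent among those above 110, which is the contrast the averages conceal.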

In this connection, the ever-recurring problem of deciding where to 
draw the line for purposes of admission must be faced. If the line is 
drawn at 110 I.B. academic failures are practically eliminated, but at 
the same time many applicants who would have finished the program 
would have been rejected. Therefore, 110 is used as a point of reference 
rather than as a mechanical point of rejection. It is here that the ex- 
perience of a seasoned clinician comes into play since human beings 
cannot be catalogued to the last decimal point; psychometric data must 
be evaluated psychodynamically if psychologists are to rise above the 
level of technicians. 

Scores on the arithmetic tests present a unique problem because 
arithmetic performance is something that can be improved easily with 
review. In fact, all the girls who rated low on the arithmetic tests were 
warned that they should review their arithmetic before entering if they 
wished to avoid difficulties later. Consequently, other things being 
equal, applicants who made low arithmetic scores were given a chance to 
improve their test scores. For this reason, arithmetic test scores did 
not prove to be as prognostic as the I.B. and reading scores. In spite 
of this, however, the average score of the failures on the arithmetic test 
was well below that of the successful students. For example, there were 
27 students with scores below 50, one-third of whom failed. However, 
two students with only fifth grade arithmetic ability on entrance to the 
School succeeded in the nursing program. Had a rigid line been drawn 
at a score of 70, there would have been only two academic failures. But 
here, as with other scores, if a high critical score had been set, there would 
have had to be many more applicants than were available in order to 
make a class and at the same time many applicants would have been 
unjustifiably rejected who are able to complete satisfactorily the nursing 
course. For this reason, a flexible policy was adopted in assessing the 
significance of psychometric scores. 







The scores on the reading test of those who failed in classwork were 
definitely bunched below the average of those who finished one year or 
more. No student who failed had a reading score equivalent to the 
average of the class. The successful students on the average were 
reading as well as second semester college freshmen whereas the failures, 
on the average, were reading only as well as first semester high school 
seniors. However, it should be pointed out that two students with only 
eighth grade reading ability were successful in the School of Nursing.

I.B., arithmetic, and vocabulary scores were combined into a total 
score. The scores of the rejected applicants ranged from 125 to 280; 
those leaving because of personality difficulties, from 170 to 275; and 
class failures from 145 to 210. Had these scores been available when the 
selection program was begun and had the lowest ten per cent been 
rejected arbitrarily, class failures would have been reduced by 50 per cent. 
This procedure would also have eliminated two girls who subsequently 
left because of personality difficulties. However, had the lowest nine
girls on the total score been rejected, the rejected group would have
included five girls who remained one year or more. Nevertheless, if the School
wishes to disregard the feelings of rejected applicants, academic failures 
can be practically eliminated by setting an appropriate critical point on 
the total scores. 
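
The trade-off described above, between failures prevented and capable applicants turned away, can be tabulated for any proposed critical score. The following is a minimal sketch; the class, the function, and the sample scores are hypothetical and do not reproduce the School's records.

```python
from typing import NamedTuple


class Student(NamedTuple):
    total_score: float  # I.B. plus arithmetic plus vocabulary
    failed: bool        # True if the student failed in classwork


def cutoff_tradeoff(students, critical_score):
    """Return (failures rejected, all failures, successful students rejected)
    if everyone scoring below `critical_score` had been refused admission."""
    rejected = [s for s in students if s.total_score < critical_score]
    failures_total = sum(s.failed for s in students)
    failures_rejected = sum(s.failed for s in rejected)
    successes_rejected = sum(not s.failed for s in rejected)
    return failures_rejected, failures_total, successes_rejected


# Hypothetical class, for illustration only.
students = [
    Student(150, True),
    Student(160, False),
    Student(190, True),
    Student(205, False),
    Student(230, False),
]
print(cutoff_tradeoff(students, 200))
print(cutoff_tradeoff(students, 155))
```

Raising the critical score removes more of the eventual failures but also rejects more students who would have succeeded, which is the cost the flexible policy is meant to avoid.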

Beginning with the second year of the testing program, the psycholo- 
gist interviewed all the applicants as soon as the psychometric data and 
the high school records were available. Twenty-eight of these inter- 
viewed applicants left because of class failure or because of personality 
difficulties. This represents a ten per cent loss after the interview 
started. More interesting, however, is the fact that interview notes 
made at the time of application anticipated difficulty for 21 of the 28 
applicants. Following are the recorded comments made on a few of 
these applicants. 


No. 347: “Lacking in poise, culture, and sophistication. Is ill at ease.
Should be satisfactory if the school wants to devote its time to developing
her personality.” (Sent home because of her personality.)


No. 483: “Is considerably emotionally involved on a conscious level. Has
considerable conflict with her family. Recently disappointed in love.
Now a man-hater. She is basically stable and her present problems are
only ones of growing up. Is a borderline risk.” (Withdrew because of
lack of interest.)


No. 487: “She is definitely a collegiate type. I do not believe she will like
nursing. She will have some difficulty fitting into a nursing routine.
She seems extremely sophisticated.” (Withdrew because of lack of
interest.)


No. 412: “May be too dominant and self-sufficient to fit into routine
smoothly. An only child.” (Resigned to go to college.)


No. 418: “Recommended to applicant that she stay in teaching. I do not
feel nursing is what she wants.” (Resigned after two weeks.)


No. 68: “Very immature; can do work if she grows up. Mother rejection
and strong father attachment. Would like to take her on an
experimental basis.” (Withdrew because of lack of interest.)


The data are clear, therefore, in showing that it is possible to eliminate 
almost all losses except those due to marriage and illness. All that 
would be necessary is to set a minimum critical score in intelligence, 
arithmetic, and reading and refuse to take all those who give evidence of 
being personality problems. This type of selection, however, is a one- 
sided affair since the School merely assures its own successes and neglects 
a large number of those who could have succeeded. Perhaps this can 
be done in industry when there is an oversupply of labor, but it is not an 
appropriate procedure for an educational institution. Final selection 
should be left to the clinical interview whereby each girl’s application is 
evaluated individually and humanely rather than by arbitrarily defined 
minimum test scores. 

Success in nursing is a constellation of traits: I.B. scores, reading 
and arithmetic ability, high school grades, adaptability, emotional ma- 
turity, interest and so on. These traits can fit together kaleidoscopically 
in many different patterns each of which may lead to success in a school 
of nursing. These patterns cannot be statistically pigeonholed but must 
be assessed anew in each evaluation interview. Among the many possi- 
bilities is an applicant with a high I.B. rating and low high school grades; 
a girl with low ability but high grades; a girl with low ability, low grades, 
but with poise, maturity, and drive to succeed as a nurse; a girl with 
high ability and good academic performance but without a great drive 
to succeed. Each of these individuals may be equally successful in a 
school of nursing. 

In considering an applicant’s ability to succeed, the ability to com- 
pensate for revealed deficiencies must not be overlooked. The depth of 
interests, the drive for a particular goal, and the basis for this drive, are 
just as important as test results. Psychodynamics of human variability 
are as instrumental in causing failure in the brilliant as success in the 
dull. It is the ability of the interviewer to subtend psychometric data 
that makes the clinical interview an indispensable unit in a personnel 
selection program. 











An Experimental Study of the Pl (“Plodding”)
Characteristics of Persistence * 


Patricia Palmer Roach 
Chicago, Illinois


For some time the concept of persistence as a specific rather than a 
general trait has been predominant. Thus, in 1939, G. R. Thornton
published a report on a factor analysis of tests designed to measure 
persistence (16). His findings showed two factors to be closely related 
to persistence. These he designated as WD (Withstanding discomfort) 
and Pl (Patient plodding). The first of these carries high loading on 
tests which have been rather thoroughly explored by Howells (9) and 
others. The second, however, has received little attention. According 
to Thornton, persons possessing much of this (Pl) factor “are charac- 
terized . . . by a greater willingness to spend time in accomplishing a 
task.” 

It is the purpose of this study to measure and analyze the Pl factor
in relation to general ability, grades, self-sufficiency, extroversion- 
introversion, and other measures. 


Orientation—A Brief History 


The first attempts to measure persistence were in the form of sub- 
jective ratings. Attempts were made by several experimenters to link 
persistence with anatomical classifications but correlations between the 
two were found to be near zero. Of the efforts to relate persistence to 
handwriting, the most complete attempt is found in Downey’s The 
Will-Temperament Tests (4). There has been a great deal of discussion 
and criticism of the reliability and validity of these tests with different 
investigators getting widely different results (5), (10). 

Rating scales were next used to measure persistence. Their use in 
this connection started with Webb and Lankes in 1915. More recently 
Morgan and Hull (11) got good results in a qualitative rating of subjects 
while they were learning a maze. 

The actual testing of persistence started with Fernald and his volom- 
eter (6). Several experimenters, including Fernald, Burtt (1), and 


* This study was conducted in the Department of Psychology, Ohio University, 
under the direction of Dr. James P. Porter. 




Chapman (2), used single tests in an attempt to measure persistence. 
Then the trend shifted to using batteries of tests. 

In 1929, Hartshorne, May, and Maller (7) published a book entitled 
Studies in service and self-control which was the most extensive survey of 
persistence tests attempted up to that time. Later work in this direc- 
tion was done by Henninger (8), Howells (9), Clark (3), and others. 

The next step in the study of persistence was the use of factor analyses 
of the scores of various tests. In 1938-1939 Ryans (12-13) gave seven- 
teen situations to forty subjects and, after making a factor analysis of 
the scores, found four scores closely related to persistence. His historical 
review of the measurement of persistence (15) is an excellent survey and 
includes a complete bibliography. 

Thornton (17) criticized Ryans’ work on the basis of too few subjects, 
omissions in statistical work, and on selection of tests. He himself did 
a factor analysis of some thirteen test scores, which, with additional 
material (age, etc.), gave twenty-two scores. He found five common 
factors including the WD and Pl factors already mentioned.


Selection of Subjects and Experimental Procedure 


Two groups of college students, one rated high and one rated low on 
the Pl trait, acted as subjects in this experiment. They were rated as 
follows: Typewritten instructions, giving the purpose of the study and 
defining the trait, were given to the heads and, in some cases, officers of 
several housing units (2-4 raters per unit). The raters were asked to 
write down the names of those within their houses who possessed highest 
and lowest degrees of the trait and to make their selections independently 
of one another, disregarding all traits (personality, intelligence, etc.) 
except this one. 

The final list of subjects included only those who had been named by 
at least two raters; thirty in the high group, and thirty in the low group. 
There were twenty-five men and thirty-five women; eleven men in the 
high and fourteen in the low group. The proportional number of men 
and women in this experiment, as well as the proportional number in 
each class (freshman, etc.) corresponds to the total registration in the 
university. 

Each subject came in for two sessions. At the beginning of the first 
session, he was put to work on the first word-building test. This was 
followed in order by studying “The Damper,” Word Building Test II,
and “The Damper” test. Following completion of “The Damper” test,
the subject was thanked for his cooperation and asked if he would come 
in again to finish a few things. 

The second session included giving Perceptual Ability test I, the 
motor inhibition test, Perceptual Ability test II, and the verbal recognition
test in the order named. In each case the directions were gone
over with the subject and his work was closely supervised. 

Later, all subjects were given the Bernreuter Personality Inventory 
by someone within their housing unit who was enrolled in a psychology 
course at the time. They were told that the test was being used by that 
individual for his course work. 


The following tests were used in this experiment: 


(1) Word Building Test I: This test was patterned after that used by Chapman
(2) and others. The subject was given five minutes to build as many words
as he could from the five letters given. Two measures were derived from this 
test: the total number of words built, and the average words per minute. 

(2) Word Building Test II: This test was similar to the first one except that 
the subjects understood that there was no time limit. Actually an arbitrary 
limit of thirty minutes was set, and those few subjects who worked that long 
were stopped. Three scores were used for this test: total number of words 
built, average number of words per minute, and total time. 

(3) Verbal Recognition Test: This test was a check list which included the 
seventy words that could be built according to the rules of the two word- 
building tests, and thirty nonsense syllables composed of the same letters. The 
subjects were instructed to put X in front of words and 0 in front of the
combinations of letters that were not words. The score was the number of words
correctly marked with an X, and the time spent on this test was recorded. 

(4) Study Time: This measure was a modified form of that used by Ryans 
(14). However, it was administered [word illegible in source], and the subjects were timed
without their knowing it. The material selected for study was “The Damper”
(from A book of characters, compiled and edited by Richard Aldington). The
score was the number of minutes the subject devoted to studying. 

(5) Study Time Test: This test consisted of four essay questions on “The
Damper.” A time limit of ten minutes was arbitrarily set for answering the
questions, and the time was recorded for those subjects who worked less than
ten minutes.


(6) Perceptual Ability Test I: This test was a modified form of the Harts- 
horne, May, and Maller “Stories Test,” and was patterned after that used 
by G. R. Thornton. 

(7) Perceptual Ability Test II: This test was similar to the first perceptual 
ability test. However it shaded off into nonsense after the first twenty lines. 
Three scores were used: the average number of lines per minute for the first 
twenty lines, the total time spent on the story, and the per cent of the total 
time spent after the first twenty lines. 

(8) Motor Inhibition Test: This test was a modified form of the Downey 
Will-Temperament test. The final score used was a ratio of the number of 
segments traced on the second trial to the number traced on the third trial, the 
first trial being for practice. Thus, the higher the ratio, the greater the 
inhibition. 

(9) The Bernreuter Personality Inventory: The six parts of the inventory 
were scored and used as measures. 

(10) Percentile scores on the Ohio State University Psychological Test (OCA),
forms 18, 19, 20, and 21, were obtained for each subject and were used as a
measure of the students’ general college ability.

(11) Point-Hour Ratio (PHR) was determined for each subject. This was 
found by dividing each subject’s total honor points by his total number of 
hours credit. Honor points were determined by assigning three points credit 
for a grade of A, two points for B, etc. 
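
The derived scores described in items (1), (2), (7), (8), and (11) are simple ratios, and a minimal sketch of them in Python may make the definitions concrete. The function names are hypothetical, and the honor-point values below B are an assumption: the text gives only A = 3 and B = 2 followed by "etc."

```python
# Illustrative scoring helpers. All names are hypothetical.

# The article states A = 3 and B = 2, "etc."; C = 1 and D = 0 continue that
# pattern and are assumed here, not taken from the source.
HONOR_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0}


def point_hour_ratio(courses):
    """Total honor points divided by total credit hours.

    `courses` is a list of (grade, credit_hours) pairs.
    """
    total_points = sum(HONOR_POINTS[grade] * hours for grade, hours in courses)
    total_hours = sum(hours for _, hours in courses)
    return total_points / total_hours


def words_per_minute(total_words, minutes_worked):
    """Average words built per minute on a word-building test."""
    return total_words / minutes_worked


def motor_inhibition_ratio(segments_second_trial, segments_third_trial):
    """Segments traced on the second trial over those traced on the third."""
    return segments_second_trial / segments_third_trial


def percent_time_after_first_twenty(minutes_first_twenty, total_minutes):
    """Per cent of total time spent after the first twenty lines."""
    return 100.0 * (total_minutes - minutes_first_twenty) / total_minutes
```

For example, a subject with an A and a B in three-hour courses and a C in a two-hour course would have a point-hour ratio of 17/8 under these assumed point values.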


Findings of This Study 


Means, standard deviations, and coefficients of variation were deter- 
mined for two groups of subjects, one rated high, and one low on the Pl 
(Patient Plodding) factor, and for the total group on all measures. 
Critical ratios of the differences in mean scores were also found. These 
measures may be found in Tables 1, 2, 3, 4, and 5. 
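
The statistics named above can be sketched as follows. This is an illustration under stated assumptions, not the author's worksheet: the article does not give its formulae, so the sketch assumes the conventional critical ratio for two independent groups (difference in means divided by the standard error of the difference) and a coefficient of variation expressed in per cent. The scores are invented.

```python
import math
import statistics


def describe(scores):
    """Return (mean, standard deviation, coefficient of variation in per cent)."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)  # population form; an assumption
    cv = 100.0 * sd / mean
    return mean, sd, cv


def critical_ratio(high_scores, low_scores):
    """Difference in means over the standard error of that difference.

    Assumes independent groups; the article does not state its formula.
    """
    mean_h, sd_h, _ = describe(high_scores)
    mean_l, sd_l, _ = describe(low_scores)
    se_diff = math.sqrt(sd_h ** 2 / len(high_scores) + sd_l ** 2 / len(low_scores))
    return (mean_h - mean_l) / se_diff


# Invented scores for two small groups, for illustration only.
high = [2, 4, 4, 4, 5, 5, 7, 9]
low = [1, 3, 3, 3, 4, 4, 6, 8]
print(describe(high))
print(describe(low))
print(critical_ratio(high, low))
```

Because the formula is assumed, values computed this way need not coincide with the critical ratios printed in the tables.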

Inasmuch as large and highly significant differences in favor of the
group rated high on Pl were found for mean scores on all parts and total
of the Ohio State University Psychological Examination (OCA) (Table 1),
two smaller groups of subjects were selected from the original by matching
on total OCA scores and sex. There were fourteen subjects (eight
men and six women) in each of the new groups. The same formulae
used for the larger groups were applied here except that the modified
small-sample formula was used to determine the standard deviation of
the difference. Tables 6, 7, 8, 9, and 10 show the data found for these
smaller groups.


Table 1

Mean Scores, Standard Deviations, Coefficients of Variation, and Critical Ratios of the
Groups Rated High (N = 30) and Low (N = 30) on Pl and Total Group on
O.C.A. Total Score, O.C.A. Part I (Vocabulary), O.C.A. Part II (Analogies),
O.C.A. Part III (Paragraph Reading), and Point-Hour Ratio

[Most entries of this table are illegible in the source. The surviving entries are listed in printed order.]

O.C.A. Total Score: 48.7; 29 | 33 | 41
O.C.A. Part I (Vocabulary): 37; 49
O.C.A. Part II (Analogies): 24; 40
O.C.A. Part III (Paragraph Reading): 75.57 | 47.03 | 61.30; 49 | 44; 4.733
Point-Hour Ratio: .58 | .33 | .68; 1.98 | 1.00 | 1.49; 33 | 46; 7.538
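
The matching of subjects on total OCA score and sex can be sketched as a simple pairing routine. The article does not say how close two OCA scores had to be to count as matched, nor how pairs were chosen, so the tolerance and the greedy nearest-score rule below are assumptions, and the subject records are invented.

```python
def match_groups(high, low, tolerance=5):
    """Pair each high-Pl subject with an unused low-Pl subject of the same
    sex whose OCA percentile lies within `tolerance` points.

    Each subject is a tuple (subject_id, sex, oca_percentile).
    The tolerance and the greedy nearest-score rule are assumptions.
    """
    pairs = []
    available = list(low)
    for h in sorted(high, key=lambda s: s[2]):
        candidates = [
            l for l in available
            if l[1] == h[1] and abs(l[2] - h[2]) <= tolerance
        ]
        if not candidates:
            continue  # no acceptable partner; the subject is dropped
        best = min(candidates, key=lambda l: abs(l[2] - h[2]))
        available.remove(best)
        pairs.append((h, best))
    return pairs


# Invented records, for illustration only.
high_group = [("H1", "M", 60), ("H2", "F", 80), ("H3", "M", 95)]
low_group = [("L1", "M", 58), ("L2", "F", 83), ("L3", "F", 20)]
for h, l in match_groups(high_group, low_group):
    print(h[0], "matched with", l[0])
```

Subjects without an acceptable partner are dropped, which is consistent with the matched groups (fourteen each) being smaller than the original groups of thirty.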

Significant differences were found for the larger groups on mean point- 
hour ratios (Table 1); mean scores on Perceptual Ability II, lines/time 
and per cent of time spent after the first twenty lines (Table 3); Verbal 
Recognition scores and time (Table 4); Study Time Test scores (Table 4); 
Word Building Test I, total words (Table 2); and Word Building Test II, 







Table 2

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 30) and Low (N = 30) on Pl and Total Group on
Word Building I (words/minutes), Word Building I (total words), Word
Building II (words/minutes), Word Building II (total words), and
Word Building II (total time)

Tests                            | SD High | SD Low | SD Total | Mean High | Mean Low | Mean Total | CV High | CV Low | CV Total | Critical Ratio
Word Building I (words/minutes)  | .58     | .45    | .57      | 1.37      | .99      | 1.18       | 43      | 45     | 48       | .2836
Word Building I (total words)    | 2.78    | 2.56   | 2.87     | 7.0       | 4.38     | 5.92       | 40      | 53     | 48       | 2.309
Word Building II (words/minutes) | .48     | .53    | .49      | 1.63      | 1.64     | 1.64       | 29      | 32     | 30       | .0758
Word Building II (total words)   | 3.83    | 4.24   | 4.34     | 20.57     | 17.5     | 19.03      | 19      | 24     | 23       | 1.22
Word Building II (total time)    | 5.17    | 3.92   | 4.65     | 13.29     | 11.57    | 12.43      | 39      | 34     | 37       | .2346

Table 3

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 30) and Low (N = 30) on Pl and Total Group on
Perceptual Ability I (lines/time), Perceptual Ability II (lines/time), Perceptual
Ability II (total time) and Perceptual Ability II (per cent of total time
spent after first twenty lines)

Tests                                                       | SD High | SD Low | SD Total | Mean High | Mean Low | Mean Total | CV High | CV Low | CV Total | Critical Ratio
Perceptual Ability I (lines/time)                           | .68     | .62    | .68      | 3.11      | 2.84     | 2.97       | 22      | 21     | 23       | 1.588
Perceptual Ability II (lines/time)                          | 1.33    | 1.00   | 1.33     | 3.64      | 2.38     | 3.01       | 36      | 42     | 44       | 4.09
Perceptual Ability II (total time)                          | 6.87    | 6.66   | 6.86     | 17.9      | 15.52    | 16.71      | 38      | 42     | 41       | 1.345
Perceptual Ability II (% total time spent after 1st 20 lines) | 18.75 | 22.34  | 24.03    | 59.36     | 35.04    | 47.20      | 32      | 64     | 51       | 4.462





























total words (Table 2). When the differences in OCA scores were ruled
out (smaller groups), significant differences in means were found to exist
for the following: Point-Hour Ratio (Table 6); Perceptual Ability Test I,







Table 4

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 30) and Low (N = 30) on Pl and Total Group on
Study Time (minutes), Study Time Test Score, Verbal Recognition Test
Score, Verbal Recognition Test Time and Motor Inhibition
(2nd trial/3d trial)

[Most entries of this table are illegible in the source. The surviving entries are listed in printed order.]

Study Time (minutes): 5.; 42; 44
Study Time Test Score: no entries legible
Verbal Recognition Test Score: no entries legible
Verbal Recognition Test Time: no entries legible
Motor Inhibition (2d trial/3d trial): 4; .26 | 1.02



































Table 5

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 30) and Low (N = 30) on Pl and Total Group
on the Bernreuter Personality Inventory Items B1-N, B2-S, B3-I,
B4-D, F1-C, and F2-S

(.. marks an entry that is illegible in the source.)

Bernreuter                     | SD High | SD Low | SD Total | Mean High | Mean Low | Mean Total | CV High | CV Low | CV Total | Critical Ratio
B1-N, Neurotic Tendency        | 28.99   | 26.51  | 27.79    | ..        | 43.87    | ..         | ..      | 60     | 67       | ..
B2-S, Self-Sufficiency         | 27.91   | 29.31  | 28.09    | 51.3      | 42.37    | ..         | ..      | ..     | 60       | ..
B3-I, Introversion-Extroversion | 30.51  | 26.34  | 28.58    | 40.1      | 44.43    | 42.27      | ..      | ..     | ..       | 0.65
B4-D, Dominance-Submission     | 30.37   | 26.87  | 28.69    | 58.43     | 56.43    | 57.43      | ..      | 48     | ..       | 0.265
F1-C, Self-Confidence          | 33.7    | 29.33  | 31.19    | 42.5      | 52.37    | 47.43      | ..      | 56     | ..       | 0.657
F2-S, Sociability              | 28.26   | 31.77  | 30.25    | 50.1      | 44.17    | 47.13      | ..      | 72     | ..       | 0.751
































lines/time (Table 8); Verbal Recognition time and score (Table 9); and 
Study Time Test score (Table 9). 










Table 6

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 14) and Low (N = 14) on Pl and Matched for Total
O.C.A. Scores on O.C.A. Total Score, O.C.A. Part I (Vocabulary), O.C.A.
Part II (Analogies), O.C.A. Part III (Paragraph Reading),
and Point-Hour Ratio

Tests                               | SD High | SD Low | Mean High | Mean Low | CV High | CV Low | Critical Ratio
O.C.A. Total Score                  | 19.02   | 19.57  | 58.43     | 57.5     | 33      | 34     | .17
O.C.A. Part I (Vocabulary)          | 23.55   | 24.26  | 50.29     | 58.79    | 47      | 41     | 1.28
O.C.A. Part II (Analogies)          | 19.16   | 19.55  | 65.07     | 61.79    | 29      | 32     | .61
O.C.A. Part III (Paragraph Reading) | 21.39   | 21.14  | 60.43     | 53.57    | 35      | 39     | 1.16
Point-Hour Ratio                    | .49     | .3     | 1.69      | 1.09     | 29      | 28     | 5.5

Table 7

Mean Scores, Standard Deviations, Coefficients of Variation, and Critical Ratios of the
Groups Rated High (N = 14) and Low (N = 14) on Pl and Matched for Total
O.C.A. Scores on Word Building I (words/minutes), Word Building I
(total words), Word Building II (words/minutes), Word Building II
(total words), and Word Building II (total time)

Tests                            | SD High | SD Low | Mean High | Mean Low | CV High | CV Low | Critical Ratio
Word Building I (words/minutes)  | .53     | .30    | 1.24      | 1.09     | 43      | 28     | 1.17
Word Building I (total words)    | 2.52    | 2.82   | 6.07      | 4.86     | 42      | 58     | 1.64
Word Building II (words/minutes) | .30     | .59    | 1.50      | 1.65     | 20      | 36     | 1.15
Word Building II (total words)   | 4.06    | 3.89   | 18.93     | 19.06    | 21      | 20     | .06
Word Building II (total time)    | 3.74    | 4.11   | 13.16     | 12.74    | 28      | 32     | .39





All of the above differences, with the exception of Verbal Recognition 
score, were in favor of the group rated high on Pl. The superiority of 
the “low” group on this item was attributed to guessing, as it was found 










Table 8

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 14) and Low (N = 14) on Pl and Matched for Total
O.C.A. Scores on Perceptual Ability I (lines/time), Perceptual Ability II
(lines/time), Perceptual Ability II (total time) and Perceptual Ability II
(per cent of total time spent after first twenty lines)

[Most entries of this table are illegible in the source. The surviving entries are listed in printed order.]

Perceptual Ability I (lines/time): 2.71; 19 | 21
Perceptual Ability II (lines/time): 2.79; 24
Perceptual Ability II (% total time spent after 1st 20 lines): 52.08
Perceptual Ability II (total time): 8.16; 18.39

















Table 9

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 14) and Low (N = 14) on Pl and Matched for Total
O.C.A. Scores on Study Time (minutes), Study Time Test Score, Verbal
Recognition Test Score, Verbal Recognition Test Time and
Motor Inhibition (2nd trial/3d trial)

[Most entries of this table are illegible in the source. The surviving entries are listed in printed order.]

Study Time (minutes): 12.94; 48
Study Time Test Score: 51.64; 51
Verbal Recognition Test Score: no entries legible
Verbal Recognition Test Time: 2.00; 6.12 | 4.7; 37; 2.78
Motor Inhibition (2nd trial/3d trial): 27 | 27; 1.02 | 91; 26 | 30; 1.57





that this group falsely recognized 3.1 words more on the average than 
did the “high” group. This difference was highly significant. 
Apparently the difference in means for point-hour ratio cannot be 

















Table 10

Mean Scores, Standard Deviations, Coefficients of Variation and Critical Ratios of the
Groups Rated High (N = 14) and Low (N = 14) on Pl and Matched for Total
O.C.A. Scores on the Bernreuter Personality Inventory, Items B1-N, B2-S,
B3-I, B4-D, F1-C, and F2-S

Bernreuter              | SD High | SD Low | Mean High | Mean Low | CV High | CV Low | Critical Ratio
B1-N, Neurotic Tendency | 31.24   | 28.63  | 34.29     | 43.07    | 93      | 66     | 1.05
B2-S, Self-Sufficiency  | 26.12   | 25.79  | 49.21     | 35.43    | 53      | 73     | 1.92
B3-I, Extroversion      | 32.31   | 27.07  | 33.79     | 39.00    | 96      | 69     | .63
B4-D, Dominance         | 34.35   | 25.84  | 56.43     | 60.64    | 61      | 43     | .50
F1-C, Self-Confidence   | 37.31   | 29.33  | 35.14     | 49.86    | 106     | 59     | 1.58
F2-S, Sociability       | 25.19   | 30.26  | 46.71     | 32.14    | 54      | 94     | 1.89


























explained by differences in ability as indicated by OCA scores. This 
may be considered a long-term measure, and the difference in mean 
scores may be taken as an indication of lack of “persistence” on the part
of those rated low on Pl when working toward remote goals. 

Study Time Test scores showed a complete lack of correlation with 
the time spent on studying the material (0.05 + .07) and with OCA 
scores (—0.01 + .10). This indicates that the differences found here are 
not due to general ability or to the time spent in studying. 

Neither of the differences in mean scores of lines/time for the two 
Perceptual Ability tests was found to be significant for both the large 
and small samples. However, as both scores measure the same thing 
(speed of reading difficult material), and as their differences were found 
to be significant in at least one of the groups, the scores may be con- 
sidered valid for differentiating between the two extremes of the trait. 

The difference in mean scores on Perceptual Ability Test II, per cent 
of total time spent after the first twenty lines, was highly significant for 
the large groups and had a fairly high critical ratio on the small ones 
(Table 8). Apparently this score differentiates fairly well between those 
rated high and low on the Pl trait. Scores on this measure correlated
0.59 + .06 with OCA scores. Moreover the correlation between lines/ 
time and per cent of total time spent after the first twenty lines for the 
second perceptual ability test was 0.49 + .06. The indication is that 
those who possess ability to do a task spend more time on it. 
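
The correlations above are reported in the form r ± error. The article does not say how the ± figures were obtained; a common convention of the period was the probable error of a correlation coefficient, PE = 0.6745(1 - r²)/√N. A minimal sketch under that assumption, taking N = 60 (the size of the full sample) as a further assumption:

```python
import math


def probable_error_of_r(r, n):
    """Classical probable error of a product-moment correlation.

    Assumes PE = 0.6745 * (1 - r^2) / sqrt(N); the article does not state
    which formula it used.
    """
    return 0.6745 * (1.0 - r ** 2) / math.sqrt(n)


# r = 0.59 is the correlation reported with OCA scores; N = 60 is assumed.
print(round(probable_error_of_r(0.59, 60), 3))
```

Not every ± figure in the text need follow from this formula with N = 60, so the sketch should be read as an illustration of the convention rather than a reconstruction of the author's arithmetic.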

The differences in mean scores of the large groups for total words on 







the two word building tests dropped out when OCA was held constant 
(Table 7); but the difference in means for Verbal Recognition time re- 
mained the same (Table 9). As this measure shows no correlation with 
OCA, it may be assumed that it is a rather clear cut measure of the 
Pl factor. 

Differences in mean scores of the “high” and “low” groups on Motor 
Inhibition were insignificant for both samples. Moreover, no significant
correlations were found to exist between this measure and other measures 
included in this study. 

No significant differences in mean scores on the Bernreuter Person- 
ality Inventory existed for the large groups (Table 5). In the matched 
groups, those rated low on Pl tended to be less self-sufficient and more
social than those rated high (Table 10). These differences are only 
slightly significant and, in view of the large standard deviations for the 
groups, may be attributed to chance. All other differences in mean 
Bernreuter scores for these matched groups are insignificant. 


Summary 


(1) Other things being equal, the following measures may be used to 
differentiate between those possessing a high degree and those possessing 
a low degree of the Pl (Patient Plodding) trait: Point-Hour Ratio, 
lines/time on Perceptual Ability Tests I and II, per cent of time spent on 
Perceptual Ability Test II after the story becomes nonsense, Verbal 
Recognition Time, and Study Time Test Score. 

(2) There is a positive relationship between an individual’s ability 
to do a task and the time he is willing to spend on it. 

(3) There are no significant differences between those possessing a 
high degree and those possessing a low degree of Pl in neurotic tendency, 
self-sufficiency, extroversion, dominance, self-confidence, or sociability as 
measured by the Bernreuter Personality Inventory. 


Bibliography 


1. Burtt, H. E. Measuring interest objectively. Sch. & Soc., 1923, 17, 440-448.
2. Chapman, J. C. Persistence, success and speed in a mental task. Ped. Sem., 1924,
31, 276-284.
3. Clark, W. H. Two tests of perseverance. J. educ. Psychol., 1935, 26, 604-610.
4. Downey, J. E. The Will-Temperament and its testing. Yonkers, N. Y.: World
Book, 1923. Pp. 333.
5. Downey, J. E., and Uhrbrock, R. S. Reliability of the group Will-Temperament
tests. J. educ. Psychol., 1927, 18, 26-39.
6. Fernald, G. G. An achievement capacity test: a preliminary report. J. educ.
Psychol., 1912, 3, 331-336.
7. Hartshorne, H., May, M. A., and Maller, J. B. Studies in service and self-control.
New York: Macmillan, 1928. Pp. 552.
8. Henninger, L. L. A comparative study of some measures of persistence.
Unpublished M. A. Thesis, Ohio University, 1932.
9. Howells, T. H. An electrical stimulus apparatus. Amer. J. Psychol., 1928, 43,
122-123.
10. Kennedy, F. The practical value of the June Downey Will-Temperament tests.
Brit. J. Psychol., 1934, 4, 260-263.
11. Morgan, J. J. B., and Hull, C. L. The measurement of persistence. J. appl.
Psychol., 1926, 10, 180-187.
12. Ryans, D. G. An experimental attempt to analyze persistent behavior: I. Measuring
traits presumed to involve “persistence.” J. Gen. Psychol., 1938, 19, 333-353.
13. Ryans, D. G. An experimental attempt to analyze persistent behavior: II. A
persistence test. J. Gen. Psychol., 1938, 19, 355-371.
14. Ryans, D. G. Some observations concerning the relationship of time spent at study
to scholarship and other factors. J. educ. Psychol., 1938, 30, 372-377.
15. Ryans, D. G. The measurement of persistence: An historical review. Psychol.
Bull., 1939, 36, 715-739.
16. Thornton, G. R. A factor analysis of tests designed to measure persistence.
Psychol. Monogr., 1939, 51, 1-42.
17. Thornton, G. R. How general is the factor of “persistence”? A re-examination
and evaluation of Ryans’ results. J. Gen. Psychol., 1940, 23, 185-189.





A Reply to Dr. Luckiesh 
Miles A. Tinker 


University of Minnesota 


The allegations of Dr. Luckiesh in his “comments” (2) on my review
(9) of his and Dr. Moss’ book (3) cannot be ignored. It is true, as stated, 
that I did not cite supporting evidence in the review since it is not 
customary. In this reply, however, I am giving the citations. 

Dr. Luckiesh’s plea that authors have the right to establish their 
own ground rules and cite only material from their own laboratory may 
be justified in presenting a summary of a specific program of research 
but less so in a systematic work in which principles for practice are 
outlined as in Reading as a visual task. Such a procedure is inexcusable 
when experimental findings of others are not in agreement with the 
author. Tinker (8, 32 titles) in discussing illumination, and Pyke (7, 180 
titles), Vernon (10, 23 titles), and Paterson and Tinker (6, 103 titles) 
in writing on typography,¹ have cited the pertinent reports, analyzed the
data and indicated the trend of the evidence. 

Luckiesh, in his “Comments,” states that an admission that visibility
does not always guarantee high readability is merely a statement of fact. 
If this is so, then he should not employ visibility measures as criteria 
for printing practice as he does on page 128 (3) where visibility is set up 
as a criterion. The authors attempt unconvincingly to justify them- 
selves by stating that in many cases it is not practicable to obtain other 
measures and furthermore the visibility measures are readily obtainable. 
As pointed out in my review, other factors than mere ability to perceive 
printed symbols accurately are involved in readability. This was
demonstrated by Webster and Tinker (11) in their study of type faces.
Furthermore, Vernon (10, pp. 160-178) has assembled data which
demonstrate that working with disconnected words, as in the study of
visibility, yields data which are not valid when applied to normal reading.

Luckiesh further states that Tinker accepts rate of reading as a suit- 
able criterion of readability without suspicion or proof and that no 
studies have been published giving proof. Here I refer him to page xv 
of How to make type readable (6) where he will find a definite statement 
concerning the limitations of rate of reading as a criterion of readability, 


1 Note also reference to detailed analysis in technical reports, p. xvi-xvii (6). 





and to pages 160-189, Appendix I, where proof of the suitability of rate 
of reading as a measure of readability, its sensitivity, and the controls 
employed are given in detail. Furthermore, the claim by Luckiesh that 
How to make type readable (6) “is unaccountably devoid of analyses of 
his data as to reliability” and that “probable errors are notably absent”
only indicates that Dr. Luckiesh has failed to read Appendix II of the 
book where tables of detailed results and location of the tables are listed. 
Furthermore, I refer Dr. Luckiesh to the 13 “Studies of Typographical 
Factors Influencing Speed of Reading” published in the Journal of 
Applied Psychology, 1928-1936. It might be added that Dr. Luckiesh’s 
view that statistical analysis is completed by including the probable 
errors is not well taken. In most instances he fails to make use of these 
to compute the significance of discovered differences. Dr. Luckiesh will 
find that every experimental report by the writer includes complete 
statistical evaluation. 
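
The point about putting probable errors to use can be illustrated with a short sketch. It assumes the classical treatment for two independent means, in which the probable error of a difference is the square root of the sum of the squared probable errors, and the difference is then expressed in units of that probable error. The function names and the figures are hypothetical and are not taken from either author's data.

```python
import math


def probable_error_of_difference(pe_a, pe_b):
    """Probable error of the difference between two independent means."""
    return math.hypot(pe_a, pe_b)  # sqrt(pe_a**2 + pe_b**2)


def difference_in_pe_units(mean_a, pe_a, mean_b, pe_b):
    """Observed difference divided by its probable error."""
    return abs(mean_a - mean_b) / probable_error_of_difference(pe_a, pe_b)


# Invented figures: two mean reading scores, each with its probable error.
print(difference_in_pe_units(10.0, 3.0, 5.0, 4.0))
```

Reporting the two probable errors alone leaves this last step to the reader; the ratio is what bears on the significance of the difference.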

Dr. Luckiesh’s researches on the relation between blink-rate and ease 
of reading do not necessarily become “axiomatic” (2) to others merely 
because he claims them to be self-evident and because he has done nearly 
forty studies using the technique. As to his challenge that I reveal any 
published experimental results which question his findings (conclusions), 
I respectfully refer him to McFarland, Holway and Hurvich’s extensive 
Studies of visual fatigue (5). After a searching analysis of their own 
extensive experiments and of other studies (including some by Luckiesh) 
these authors say “A high blink-rate need mean neither an increase in
fatigue nor an increase in difficulty of seeing” (p. 85), and conclude that 
“The rate of blinking can hardly be considered as a valid index of visual 
fatigue” (p. 86). This would seem to raise a question concerning the 
validity of the blink-rate technique as a measure of ease of seeing. 

Contrary to Dr. Luckiesh’s claim, other work has been done on heart 
rate during reading. The experiment by McFarland, Knehr and Berens 
(4) was designed to check the findings obtained in Luckiesh’s laboratory. 
The results led to the conclusion that “It is questionable whether reliable 
criteria for determining adequate levels of illumination for tasks such as 
reading during short periods of time (approximately 2 hrs.) can be ob- 
tained in terms of . . . heart rate... .” 

Dr. Luckiesh offers no adequate justification for the questionable use 
of the geometric rather than the arithmetic mean. Kelley’s (1, pp. 65- 
66) discussion of the geometric mean certainly does not suggest that it is 
applicable to Luckiesh’s data. 
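
The choice between the two means matters because they can differ widely on the same data. The sketch below uses invented values, not Luckiesh's measurements, to show that the geometric mean of positive, skewed values falls well below the arithmetic mean.

```python
import statistics

# Invented, strongly skewed positive values, for illustration only.
values = [1.0, 10.0, 100.0]

arithmetic = statistics.fmean(values)           # (1 + 10 + 100) / 3
geometric = statistics.geometric_mean(values)   # cube root of 1 * 10 * 100

print(arithmetic, geometric)
```

For positive values the geometric mean never exceeds the arithmetic mean, so substituting one for the other changes the reported averages whenever the data are not all equal.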

Dr. Luckiesh fails to attempt any answer to my criticisms on (1) his
application of data on news-type to book-type, (2) his mis-statements on 
printing practice, (3) inferences without data, and (4) inconsistencies of 










interpretation. As to his accusation that I am prejudiced and lack 
information, one may point out that lack of agreement does not prove 
prejudice and that Luckiesh or any one else is not immune from criticism 
merely because he has spent a lifetime in some field of research. Most 
scientists will agree that research should be evaluated in terms of its 
merits, not in terms of how long the experimenter has been working 
in the field. 

Contrary to Luckiesh’s statement, he did not make use of ‘facts 
which are available’ in specifying typography for optimum readability 
(6, 7, 10). He only employed inadequate material derived from his own 
laboratory. 

I am well aware that an author is forced to compromise with his 
publisher. Paterson and Tinker (6) had to do so with reference to paper 
stock, type-form for authors’ names in bibliography, and margins em- 
ployed in their book. All other typographical factors follow our recom- 
mendations. To note the fact that Luckiesh and Moss’ book did not 
follow their specifications is not a criticism. As a matter of fact the 
typography used is better than if their specifications had been employed. 

To write authoritatively in the field of reading it is necessary for the 
author to have an adequate knowledge of the fundamental principles of 
reading. The authors of Reading as a visual task reveal a lack of this 
knowledge, especially in the field of perception and eye movements. 
Nevertheless, as stated in my review, there is much of real value in their 
book. 


Received July 22, 1943. 


References 


1. Kelley, T. L. Statistical method. New York: Macmillan Company, 1924. Pp.
369.
2. Luckiesh, Matthew. Some comments on Dr. Tinker’s review of Reading as a visual
task. J. appl. Psychol., 1943, 27, 360-362.
3. Luckiesh, Matthew and Moss, F. K. Reading as a visual task. New York: D. Van
Nostrand Company, 1942. Pp. 428.
4. McFarland, R. A., Knehr, C. A., and Berens, C. Metabolism and pulse rate as
related to reading under high and low levels of illumination. J. exp. Psychol.,
1939, 25, 65-75.
5. McFarland, R. A., Holway, A. H., and Hurvich, L. M. Studies of visual fatigue.
Boston: Graduate School of Business Administration, Harvard University
(Soldiers Field), 1942. Pp. 255.
6. Paterson, D. G., and Tinker, M. A. How to make type readable. New York: Harper
and Brothers, 1940. Pp. 209.
7. Pyke, R. L. The legibility of print. London: His Majesty’s Stationery Office, 1926.
Pp. 123.
8. Tinker, M. A. Illumination standards for effective and comfortable vision. J.
consult. Psychol., 1939, 3, 11-20.
9. Tinker, M. A. Review of Reading as a visual task by M. Luckiesh and F. K. Moss.
J. appl. Psychol., 1943, 27, 116-118.
10. Vernon, M. D. The experimental study of reading. London: Cambridge
University Press, 1931. Pp. 160-178.
11. Webster, H. A., and Tinker, M. A. The influence of type face on the legibility of
print. J. appl. Psychol., 1935, 19, 43-52.





NEWS AND NOTES 


Dr. Willard C. Olson, Secretary of the American Psychological Asso- 
ciation, Inc., reports the election of the following officers at the Septem- 
ber meeting of the Association: President, Dr. Gardner Murphy, College 
of the City of New York; Member of Council of Directors: Dr. C. M. 
Louttit, on leave of absence from Indiana University serving in the 
Bureau of Naval Personnel, and Dr. Donald Marquis, Yale University; 
Members on the National Research Council: Dr. Edna Heidbreder, 
Wellesley College; Dr. Ernest R. Hilgard, Stanford University; and 
Dr. Willard C. Olson, University of Michigan; Member on the Social 
Science Research Council, Dr. Harold E. Jones, University of California; 
and Editor of the Journal of Abnormal and Social Psychology, Dr. Gordon 
W. Allport, Harvard University. Dr. Willard C. Olson was re-elected 
Secretary. An appropriation of $10,000 was voted for the support of the 
Office of Psychological Personnel through the National Research Council. 





Dr. Carroll L. Shartle, Chief of the Occupational Analysis Section 
of the War Manpower Commission, announces that a sample occupational
interview illustrating the use of the Dictionary of Occupational
Titles and the use of Oral Trade Questions has been recorded on phono- 
graph records by the National Broadcasting Company. This transcrip- 
tion should be useful in teaching courses in personnel psychology and in 
training employment interviewers. 





Mr. Donald H. Dabelstein, Director of the Division of Vocational 
Rehabilitation of the Physically Handicapped, Minnesota Department 
of Education, has submitted the following statement in regard to recent 
federal legislation which will expand the program for the vocational 
rehabilitation of the physically handicapped civilians: 

“A publicly supported program for the vocational adjustment of the 
physically handicapped has been in existence since 1920. However, in 
June 1943 Congress enacted legislation (Public Law 113) to permit 
expansion of the existing state-federal program and to increase the types 
of services to be extended to disabled persons. 

“The revised legislation makes no change in the basic function of 
vocational rehabilitation, namely, to assist persons with permanent 
partial disabilities to select, prepare for and establish, or reestablish 
themselves, in full time employment. Briefly, the services of the new
program will provide for vocational counseling, training, maintenance
during training, and needed medical attention. 

“The act authorizes the Federal Administrator to make studies, 
investigations, and reports with respect to the abilities, aptitudes, and 
capacities of disabled individuals, and to conduct or secure courses of 
instruction for any of the personnel participating in the rehabilitation 
program. The length of training, however, for professional personnel is 
limited to six weeks but includes cost of tuition, books, subsistence, and 
traveling expense. 

“The purpose of vocational rehabilitation is similar to that of other 
vocational adjustment programs. As such, the success of individual 
rehabilitation depends in large measure upon the choice of an appropriate 
occupational objective or the quality of vocational guidance. Rehabili- 
tation differs only in that it requires a continuing responsibility from the 
time of referral until an adequate occupational adjustment is obtained. 

“In spite of the fact that guidance, counseling or personnel work 
plays a major role in the rehabilitation program, together with the fact 
that much of the counseling or personnel work is psychological in nature, 
nevertheless during the past twenty-three years only a few states have 
engaged personnel with applied psychological or personnel background. 
It is hoped that provision for an expanded civilian rehabilitation program 
will not only arouse the interests of applied psychologists, but also that 
the services of trained clinical personnel workers will be absorbed in the 
program. Opportunities should be available for persons trained in indi- 
vidual diagnosis in relation to vocational adjustment, vocational coun- 
seling, job analyses, research, and tests and measurements.” 





The attention of applied psychologists and departments of psychology 
is called to the soldiers rehabilitation law passed by Congress on March 
24, 1943. Provision in the law is made for the vocational rehabilitation 
of veterans of the present war whose employability has been lost by 
virtue of a handicap due to service-incurred disability. The provisions 
of the law are similar to the expanded vocational rehabilitation program 
for physically handicapped civilians noted above. Of especial interest 
to psychologists is the provision that the administrator is empowered to 
make or cause to be made “studies, investigations and reports inquiring 
into the rehabilitation of disabled persons and the relative abilities, 
aptitudes, and capacities of the several groups of the variously handi- 
capped and as to how their potentialities can best be developed and their 
services best utilized in gainful and suitable employment, including the 
rehabilitation programs of foreign nations engaged in the present war.” 







The National Opinion Research Center at the University of Denver 
has issued Report No. 9 dated June 1943, which contains opinion poll 
results on a number of issues involved in the reconversion period from 
war to peace. Copies of the report can be obtained from the National 
Opinion Research Center for 10 cents to cover, in part, printing costs and 
postage. Periodic reports issued by the Research Center can be obtained 
by becoming a subscribing member with dues of $2.50 per year. Non- 
commercial libraries can obtain the periodic reprints at $1.50 per year. 





The May issue of the Bulletin of the Menninger Clinic, Topeka, 
Kansas, is devoted to five contributions dealing with clinical psychology 
in the psychiatric clinic. Copies may be purchased for 15 cents each. 





Psychology for the fighting man has been recently issued by the Na- 
tional Research Council and the Infantry Journal in the Penguin Book 
Series. The book, which contains 456 pages, is based on the contribu- 
tions of 59 collaborators. The contents comprise twenty chapters which 
constitute an elementary textbook on psychology written directly for the 
soldier regardless of rank. 

This book is not a symposium, but is a unified textbook achieved by 
having Dr. Edwin G. Boring of Harvard University and Margery Van de 
Water of Science Service assume responsibility for putting the materials 
in final form. The work itself was undertaken by a committee of the 
National Research Council with the collaboration of Science Service as 
a contribution to the war effort. It tells all about military psychology. 
Although written in popular form there has been no sacrifice of its scien- 
tific accuracy. It is a major contribution of psychology to the war effort. 





The News Bureau of Ohio University, under date of August 5, 1943, 
reports that Dr. James P. Porter, editor of the Journal of Applied Psy- 
chology from 1921 to 1942, was honored at Ohio University when a new 
loan fund to be known as the James P. Porter Loan Fund in Psychology 
was established. Upon his retirement from active teaching on July 30, 
1943, the Board of Trustees of the University elected Dr. Porter to the 
position of Professor Emeritus of Psychology. 













Book Reviews 


Stoddard, George D. The meaning of intelligence. New York: The
Macmillan Co., 1943. Pp. 504.

As the title implies, this book is a discussion of the author’s conception 
of intelligence, and of various factors which influence intelligent behavior. 
The book consists of five parts: I, The Nature of Intelligence; II, The 
Measurement of Intelligence; III, Growth in Intelligence; IV, Heredity 
and Environment; V, Intelligence and Society. Stoddard has evidently 
read widely; but the informed reader will quickly discern that the refer- 
ences are highly selective and are usually for the purpose of defending a 
very definite point of view. 

The tone of the book is well conveyed by the “definition” of intelli- 
gence given on page 4 of Chapter I. Stoddard writes: “Intelligence is 
the ability to undertake activities that are characterized by (1) difficulty, 
(2) complexity, (3) abstractness, (4) economy, (5) adaptiveness, (6) social 
value, and (7) the emergence of originals, and to maintain such activities 
under conditions that demand a concentration of energy and a resistance 
to emotional forces.” There is certainly little that one could possibly
add to this amazing congeries, which is, of course, no definition at all 
but a general summary of the 1921 symposium. I am quite sure that 
no tests now available, or likely to be available, could possibly encompass 
Stoddard’s definition. Certainly even its most rabid partisan has never 
believed that the Stanford-Binet runs the gamut of human behavior. 

Apparently Stoddard is none too happy with his definition as he tries 
several times later on in the book to pin it down to earth. The most 
intelligible attempt, it seems to me, is on page 285 where Doll’s state- 
ment that intelligence is only a part of ‘mentality’ is quoted. Doll is 
contending that in the evaluation of a child’s general level one must 
consider, in addition to abstract ability, social and emotional behavior, 
and also motor (and perhaps mechanical) skills. Stoddard says that he 
agrees with this view which, it may be noted, is essentially Thorndike’s 
notion of “three kinds of intelligence.”

It will soon be abundantly clear, even to the casual reader, that one 
of Stoddard’s purposes—if not his major purpose—is to discredit and 
demolish the IQ. The “inconstancy” of the IQ is the recurring theme 
in Parts I-III. To be sure, Point Scales receive a mild chastisement in 
Chapter 5, but the brunt of the assault (Chapter 4) is against the Stanford-Binet
and the IQ. The evidence against IQ constancy is skillfully
marshaled and is often cleverly distorted, as will be pointed out later.
More important as giving the polemic and non-scientific character of the 
book is the frequent use of ridicule, attempted wit, exhortation, special 
pleading, and specious argument. The following are samples: p. 50, 
“The whole village knows the idiot, but academies of science have wel- 
comed men whose schemes for differentiating IQ 70 from IQ 170 were 
quite worthless”; p. 115, “Point scoring should reduce the fanatic attachment
that some persons have for a fixed quantity that can be predicted,
subsidized, and inherited—the modest virtues of the IQ”; p. 258, “the
IQ, some say, is fixed: what varies is the relationship of an individual to
it, such variation being a product of invalidity in the test and idiosyncrasy
in the child—given a perfect test and an ever-normal child the
IQ would be constant!”; p. 120, where Thurstone’s criticism of the use
of IQs for adults is made to appear to be a criticism of IQs in general; 
p. 110, the futile and involved criticism of the vocabulary test as a 
measure of intelligence by the use of false premises and questionable 
logic. It is significant that throughout Stoddard’s discussion of the age- 
scale and the IQ, almost nothing is said concerning the care with which 
the Stanford-Binet was constructed; the adequacy of its sampling; the 
heroic efforts made to insure item validity; the careful determination of 
the PE of an IQ through which are computed the inevitable fluctuations 
to be expected in the IQ; the many factors to be considered before a fair 
estimate of IQ can be obtained or the IQ itself evaluated. On the whole,
Chapter 4, “Translations of Binet Tests,” constitutes, I am sure, one
of the most misleading and unfair treatments of the Binet-type test ever 
written. 

Stoddard’s technique in Parts I-III is to set up a straw man and then 
lustily lambaste it. A very convenient straw man is the proposition 
that the IQ is a fixed, invariable measure of native ability. I doubt if 
any experienced psychologist today seriously believes this or thinks that 
IQs never change. The IQ is essentially a statistical device through 
which a child’s performance on a carefully selected array of tests is 
compared with the performance of white American school children of 
his own age, born and reared under ordinary conditions of American life. 
The Stanford-Binet is one of the few tests so constructed that a child’s 
position with reference to his group (his IQ) can remain constant within 
its PE of measurement. The really important fact then is not the change 
in IQ itself, but the reasons for the change. Does anyone believe that 
the feebleminded child who shows an increase of 20 points in IQ after 
several months of good food and kind treatment in an institution is 
“really” smarter? Or that the bright youngster whose IQ drops from 
150 to 122 (Old Stanford-Binet) as he reaches his late ’teens has “really”
deteriorated? Marked changes in IQ up or down are not facts either to
bemoan or rejoice over. Rather such shifts are a challenge to the
Examiner. Is the change due to errors of measurement, to drastic 
changes in the environment, or to changes within the child himself? 
The first two seem the more plausible. Certainly studies of intensive 
practice have given little evidence of persisting changes in the level of a 
child’s performance. And I think we should all agree that one cannot 
conclude that a change has taken place in the child’s “true” ability until
all possible extrinsic causes have been investigated and eliminated. 

Stoddard’s thesis is that of the Iowa group, namely, that intelligence 
can be increased by training and that the IQ, in consequence, is necessarily
“inconstant.” The evidence is carefully slanted towards “proving”
this contention. I shall cite one example, to illustrate what I mean
by “slanting.” A frequency polygon taken from P. Cattell is reproduced
on p. 221 showing the changes in IQ (Stanford-Binet, 1916) upon retest 
after varying time intervals. The figure shows 3331 comparisons ob- 
tained from about 1200 children over a 7-year period. Extreme IQ 
changes range from +50 to —50 points, the bulk of the changes, how- 
ever, falling between +20 and —20. Stoddard remarks, after quoting 
a statement from Cattell, that this result “is perhaps typical of IQ 
variations for large samplings based on the 1916 Stanford Revision.” 
What the reader would not know from this discussion are the following 
facts: 

1. Two or more examinations were administered to each child, the 
time intervals between tests varying from a few months to 7 years. 
Many children had more than two examinations and the practice effect 
must have been considerable (admitted by Cattell). 

2. Cattell found a tendency for bright children to gain in IQ upon 
retest and for dull children to lose. Cattell points out also that included 
in her data are the records of many bright children who were tested 
several times. Both of these conditions would tend to exaggerate IQ 
changes and automatically lengthen the range. 

3. There were 60 different examiners. Stoddard writes that “while 
Cattell appears to feel that tests given by different expert examiners will 
cause a spurious variation in measurement, there is no evidence for this 
in her report.” On the contrary, the evidence is clear (given on p. 616
of Cattell’s report) that the “personal equation” of the examiner makes 
a difference of 13 IQ points, on the average, and may make a difference 
of 20-25 points. 

4. Taking the PE of an IQ to be approximately 4 points, we may 
expect a change of +16 or more IQ points to occur at least once in 100 
times from errors of measurement alone. If to test unreliability is added 
the “personal equation” of the examiner, changes of 20 or more points 
in IQ are hardly unexpected. But in spite of such variations, it may be
noted that only 3 per cent of all comparisons yielded IQ differences of
more than 20 IQ points. 
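
The arithmetic behind point 4 can be sketched as follows. This is only an illustration under stated assumptions, not the reviewer's own computation: errors of measurement are assumed to follow the normal curve, one probable error is taken as 0.6745 standard deviations, and, for a test-retest change, the two errors are assumed independent, so that the PE of a difference is the PE of a single IQ multiplied by the square root of 2. The function and variable names are hypothetical.

```python
import math

# One probable error (PE) spans 0.6745 standard deviations of a normal curve.
PE_PER_SIGMA = 0.6745


def prob_change_at_least(points, pe):
    """Two-sided chance that a normal error with probable error `pe`
    is at least `points` in absolute size."""
    sigma = pe / PE_PER_SIGMA
    # P(|X| >= d) for X ~ Normal(0, sigma) equals erfc(d / (sigma * sqrt(2))).
    return math.erfc(points / (sigma * math.sqrt(2.0)))


PE_SINGLE_IQ = 4.0  # PE of one IQ, the figure given in the review
# Assumption: independent errors on test and retest, so the PE of the
# difference between two IQs is sqrt(2) times the PE of one IQ.
PE_RETEST_DIFF = PE_SINGLE_IQ * math.sqrt(2.0)

p_single = prob_change_at_least(16.0, PE_SINGLE_IQ)
p_retest = prob_change_at_least(16.0, PE_RETEST_DIFF)

print(f"Error of 16+ points in a single IQ:   {p_single:.4f}")
print(f"Change of 16+ points between 2 tests: {p_retest:.4f}")
```

On the second reading (a difference between two fallible scores) the chance of a 16-point shift from measurement error alone comes out well above one in a hundred; on the first reading (a single score with PE of 4) it falls somewhat below that figure. Which reading the reviewer intended is not stated in the text.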

In short, by a change in emphasis and the presentation of further 
data, one could easily argue for a rather remarkable exhibition of IQ 
constancy from Cattell’s data. The same may be said of other evidence 
submitted by Stoddard. 

Part IV deals with heredity and environment. Here a rather amazing 
reversal of position takes place. The IQ, previously demolished, is now 
reinstated and refurbished, and becomes the means of “proving” the 
all-importance of environment. The Iowa studies of Wellman et al. are
presented to “prove” that a child’s IQ (and presumably his intelligence)
can be dramatically elevated through nursery school training. The work
of Skeels and Skodak is cited to “prove” that children of low grade or
feeble-minded parents when placed at an early age in a good foster home
exhibit IQs on the average above 100 (as high as 140). In one group of
such children, only 4 in a total of 87 had IQs below 80. Practically all
of the work quoted is out of Iowa. Leahy’s study of foster children
(perhaps the best of its kind) is described as “puzzling”; L. S. Hollingworth
and H. E. Jones are not cited; “doubt is cast” upon the Minnesota
studies of Goodenough and others because of “selective factors”; Burks’
work is omitted in toto, as is also Woodworth’s recent critical monograph.
Almost nothing is said about the difficulty of obtaining valid IQs for
adults; of getting fair measures from the mothers of illegitimate children;
and of making certain of the “true” fathers of these children.

I admit that an author is privileged to present his own point of view; 
and even to select data which will support it. But doesn’t he also have 
the obligation of giving his readers a fair deal? Shouldn’t the reader at 
least be told what criticisms have been leveled at the Iowa work by 
competent psychologists? It happens that I agree with McNemar and 
Goodenough (whose criticisms of the Iowa work are nowhere mentioned) 
that the Iowa investigators have committed almost every experimental 
and statistical error in the calendar, as well as the unforgiveable sin of 
being well-nigh unintelligible in their presentation. And I think it is 
little short of criminal to suggest, even by implication, that no matter 
what a child’s parentage may be one can count on his being “normal”—
or even bright—if only he has a “good” home. 

There are those who hold that a book should never be unfavorably 
reviewed; that error inevitably falls of its own weight. In an ideal 
sense, perhaps, this may be true. But I feel that it is the duty of a 
reviewer to accelerate the demise of an erroneous view insofar as he is 
able. He may at least give it a gentle push. 


Henry E. GARRETT 
Columbia University 



















New Books, Monographs, and Pamphlets 


The psychology of efficiency. Arthur G. Bills. New York: Harper & 
Brothers, 1943. 

Psychology for the fighting man. E. G. Boring and M. Van de Water,
et al., for a Committee of the National Research Council. Washington,
D. C.: The Infantry Journal. Penguin Special S212, 1943.
Pp. 456. $.25.

Mysticism in modern psychology. Charles Carle. New York: Psycho- 
Sociological Press, 1943. $1.00. 

Mechanical methods for increasing the speed of reading. Eloise B. Cason. 
New York: Bureau of Publications, Teachers College, Columbia
University, 1943. Pp. viii + 80. $1.75.

Hypnotism. G. H. Estabrooks. New York: E. P. Dutton & Co., Inc., 
1943. Pp. 249. $2.50. 

The Chicago Mental Growth Battery: Ten tests of graded difficulty for the 
study of intellectual development. Frank N. Freeman and M. A. 
Wenger. Chicago: The University of Chicago Press, 1943. Pp. 58. 
$1.00. 

Outlines of research in general speech. Howard Gilkinson. Minneapolis: 
Burgess Publishing Company, 1943. Pp. 80. $1.75. 

Doctor in the making. Dr. Ham and Captain Salter. Philadelphia: 
J. B. Lippincott Company, 1943. $2.00. 

The readability of certain type sizes and forms in sight-saving classes. 
Harold J. McNally. New York: Bureau of Publications, Teachers 
College, Columbia University, 1943. (Contributions to Education 
No. 883.) Pp. 71. $1.75. 

The psychology of military leadership. L. A. Pennington, R. B. Hough, 
Jr., H. W. Case. New York: Prentice-Hall, Inc., 1943. Pp. ix + 
288. 

William McDougall, M.B., D.Sc., F.R.S.: A bibliography. Compiled by 
Anthony L. Robinson. Durham, North Carolina: Duke University 
Press, 1943. Pp. 54. $1.50. 

The writing of infrequently used words in shorthand. Clyde Eugene Rowe. 
New York: Bureau of Publications, Teachers College, Columbia
University, 1943. Pp. viii + 90. $1.60.

Discovering ourselves. Edward A. Strecker, Kenneth E. Appel, John W. 
Appel. New York: The Macmillan Company, 1943. (2nd ed.) 
Pp. xvii + 306. $3.00. 




Test yourself for a war job. S. Vincent Wilking and Dorothy J. Cush- 
man. Boston: Houghton Mifflin Company, 1943. Pp. 137. $1.50. 

The expression of personality. Werner Wolff. New York: Harper & 
Brothers, 1943. Pp. 334. $3.50. 

Emotion in man and animal. Paul Thomas Young. New York: John 
Wiley & Sons, Inc., 1943. Pp. xiii + 422. $4.00.

Annual report of the Social Science Research Council 1941-1942. New 
York: Social Science Research Council, 1943. Pp. 73. 

Guidance manual for the high-school victory corps. Federal Security 
Agency, and U. S. Office of Education, Victory Corps Series Pamphlet
Number 4. Washington, D. C.: Superintendent of Documents, 
Government Printing Office, 1943. Pp. 37. $.20. 






















